Understanding the following concepts will help you get the most out of InfluxDB.
What’s in a database? This page gives SQL users an overview of how InfluxDB is like an SQL database and how it’s not. It highlights some of the major distinctions between the two and provides a loose crosswalk between the different database terminologies and query languages.
In general… InfluxDB is designed to work with time-series data. SQL databases can handle time-series data but weren’t created strictly for that purpose. In short, InfluxDB is made to store a large volume of time-series data and to perform real-time analysis on that data quickly.
InfluxDB is a time-series database. Optimizing for this use-case entails some tradeoffs, primarily to increase performance at the cost of functionality. Below is a list of some of those design insights that lead to tradeoffs:
For the time series use case, we assume that if the same data is sent multiple times, it is the exact same data that a client just sent several times.
Pro: Simplified conflict resolution increases write performance.
Con: Cannot store duplicate data; may overwrite data in rare circumstances.
Deletes are a rare occurrence.
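The duplicate-handling tradeoff above can be illustrated with a toy model. This is only a sketch, not InfluxDB's implementation. It assumes a point is identified by its measurement, tag set, and timestamp, and the names `store` and `write_point` are hypothetical.

```python
# Toy model only: a dict keyed by what is assumed to identify a point.
store = {}

def write_point(measurement, tags, fields, timestamp):
    """Store a point; a repeat of the same key merges into the existing entry."""
    key = (measurement, tuple(sorted(tags.items())), timestamp)
    store.setdefault(key, {}).update(fields)

write_point("cpu", {"host": "a"}, {"usage": 0.5}, 1000)
write_point("cpu", {"host": "a"}, {"usage": 0.5}, 1000)  # resend: no duplicate is kept
write_point("cpu", {"host": "a"}, {"usage": 0.9}, 1000)  # same key, new value: overwrites

print(len(store))  # 1
```

No conflict resolution is needed because the last write for a given key simply wins. That keeps writes cheap, at the cost of silently replacing the earlier value.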
aggregation An InfluxQL function that returns an aggregated value across a set of points. See InfluxQL Functions for a complete list of the available and upcoming aggregations.
Related entries: function, selector, transformation
batch A collection of points in line protocol format, separated by newlines (0x0A). A batch of points may be submitted to the database using a single HTTP request to the write endpoint. This makes writes via the HTTP API much more performant by drastically reducing the HTTP overhead.
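As a rough sketch of what a batch looks like, the snippet below joins several points into one newline-separated payload. The helper `to_line` and the sample points are hypothetical. It skips the escaping and type suffixes that real line protocol requires, and it only builds the request body without sending it.

```python
def to_line(measurement, tags, fields, timestamp):
    """Format one point as: measurement,tag=value field=value timestamp."""
    tag_part = "".join(f",{k}={v}" for k, v in sorted(tags.items()))
    field_part = ",".join(f"{k}={v}" for k, v in fields.items())
    return f"{measurement}{tag_part} {field_part} {timestamp}"

points = [
    ("cpu", {"host": "a"}, {"usage": 0.5}, 1000),
    ("cpu", {"host": "b"}, {"usage": 0.9}, 1000),
    ("cpu", {"host": "a"}, {"usage": 0.7}, 2000),
]

# One payload for all points, separated by newlines (0x0A).
batch = "\n".join(to_line(*p) for p in points)
body = batch.encode("utf-8")  # would be the body of a single HTTP write request

print(batch)
```

Sending `body` in one request instead of three is where the reduction in HTTP overhead comes from.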
Before diving into InfluxDB it’s good to get acquainted with some of the key concepts of the database. This document provides a gentle introduction to those concepts and common InfluxDB terminology. We’ve provided a list below of all the terms we’ll cover, but we recommend reading this document from start to finish to gain a more general understanding of our favorite time series database.
database, field key, field set, field value, measurement, point, retention policy, series, tag key, tag set, tag value, timestamp
Check out the Glossary if you prefer the cold, hard facts.
Every InfluxDB use case is special and your schema will reflect that uniqueness. There are, however, general guidelines to follow and pitfalls to avoid when designing your schema.
General Recommendations, Encouraged Schema Design, Discouraged Schema Design, Shard Group Duration Management
General Recommendations
Encouraged Schema Design
In no particular order, we recommend that you:
Encode metadata in tags
Tags are indexed and fields are not. This means that queries that filter on tags are more performant than those that filter on fields.
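To see why indexing matters, here is a minimal sketch of the difference between an indexed lookup and an unindexed scan. It is a toy in-memory model, not how InfluxDB stores data, and `tag_index`, `by_tag`, and `by_field` are hypothetical names.

```python
from collections import defaultdict

points = [
    {"tags": {"host": "a"}, "fields": {"usage": 0.5}},
    {"tags": {"host": "b"}, "fields": {"usage": 0.9}},
    {"tags": {"host": "a"}, "fields": {"usage": 0.7}},
]

# Index tags: (tag key, tag value) -> positions of matching points.
tag_index = defaultdict(list)
for i, p in enumerate(points):
    for k, v in p["tags"].items():
        tag_index[(k, v)].append(i)

def by_tag(key, value):
    """Indexed lookup: jumps straight to the matching points."""
    return [points[i] for i in tag_index.get((key, value), [])]

def by_field(key, predicate):
    """Fields are not indexed here, so every point has to be examined."""
    return [p for p in points
            if key in p["fields"] and predicate(p["fields"][key])]

print(by_tag("host", "a"))
print(by_field("usage", lambda v: v > 0.6))
```

The tag query touches only the matching entries, while the field query does work proportional to the total number of points.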
The InfluxDB Storage Engine and the Time-Structured Merge Tree (TSM)
The new InfluxDB storage engine looks very similar to an LSM Tree. It has a write-ahead log and a collection of read-only data files, which are similar in concept to SSTables in an LSM Tree. TSM files contain sorted, compressed series data.
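The general pattern described above (log writes first, then periodically flush them into immutable sorted files) can be sketched as follows. This is a generic LSM-style toy, not the TSM engine itself. `ToyEngine` and its `flush_at` threshold are hypothetical, and compression is omitted.

```python
class ToyEngine:
    def __init__(self, flush_at=3):
        self.wal = []      # append-only log of raw writes
        self.cache = {}    # in-memory view of writes not yet flushed
        self.files = []    # immutable, sorted snapshots (stand-ins for data files)
        self.flush_at = flush_at

    def write(self, series, timestamp, value):
        self.wal.append((series, timestamp, value))  # log before applying
        self.cache[(series, timestamp)] = value
        if len(self.cache) >= self.flush_at:
            self.flush()

    def flush(self):
        # Snapshot the cache as a sorted, read-only tuple, then reset.
        self.files.append(tuple(sorted(self.cache.items())))
        self.cache.clear()
        self.wal.clear()  # logged writes now live in a data file

    def read(self, series):
        merged = {}
        for data_file in self.files:  # oldest first, so newer values win
            for (s, t), v in data_file:
                if s == series:
                    merged[t] = v
        for (s, t), v in self.cache.items():
            if s == series:
                merged[t] = v
        return sorted(merged.items())

engine = ToyEngine(flush_at=3)
for ts, value in [(3, 0.3), (1, 0.1), (2, 0.2), (4, 0.4)]:
    engine.write("cpu", ts, value)
print(engine.read("cpu"))
```

Reads merge the flushed files with the in-memory cache, which is why the files themselves never need to be modified after they are written.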
InfluxDB will create a shard for each block of time. For example, if you have a retention policy with an unlimited duration, shards will be created for each 7-day block of time.
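The arithmetic of mapping a timestamp to its block of time can be sketched like this. The function `block_for` is hypothetical, and aligning blocks to the Unix epoch is an assumption made for illustration. The actual boundaries InfluxDB chooses may differ.

```python
from datetime import datetime, timedelta, timezone

# Assumption for this sketch: blocks are counted from the Unix epoch.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def block_for(ts, duration=timedelta(days=7)):
    """Return the [start, end) block of length `duration` that contains ts."""
    index = (ts - EPOCH) // duration  # whole blocks elapsed since EPOCH
    start = EPOCH + index * duration
    return start, start + duration

ts = datetime(2024, 1, 10, 12, 0, tzinfo=timezone.utc)
start, end = block_for(ts)
print(start, end)
```

Every point whose timestamp falls in the same `[start, end)` interval would land in the same block, so the number of blocks grows with the time range covered rather than with the number of points.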