This page documents an earlier version of InfluxDB. InfluxDB v2.7 is the latest stable version.
Understanding the following concepts will help you get the most out of InfluxDB.
In-memory indexing and the Time-Structured Merge Tree (TSM)
The InfluxDB storage engine and the Time-Structured Merge Tree (TSM): The InfluxDB storage engine closely resembles an LSM tree. It has a write-ahead log and a collection of read-only data files that are similar in concept to SSTables in an LSM tree. TSM files contain sorted, compressed series data. InfluxDB creates a shard for each block of time. For example, if you have a retention policy with an unlimited duration, shards are created for each 7-day block of time.
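The idea of "one shard per 7-day block of time" can be illustrated with a minimal sketch that maps a timestamp to the fixed-size block containing it. The function name `shard_window` and the alignment origin (a Monday at 00:00 UTC) are assumptions made for illustration; this excerpt does not say how InfluxDB aligns its blocks.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical alignment origin: a Monday at 00:00 UTC. This is an assumption
# for illustration, not a documented InfluxDB constant.
ORIGIN = datetime(1970, 1, 5, tzinfo=timezone.utc)


def shard_window(ts, duration=timedelta(days=7), origin=ORIGIN):
    """Return (start, end) of the fixed-size time block that contains ts."""
    # timedelta // timedelta yields an integer number of whole blocks.
    blocks = (ts - origin) // duration
    start = origin + blocks * duration
    return start, start + duration


start, end = shard_window(datetime(2016, 4, 6, 12, 0, tzinfo=timezone.utc))
print(start.isoformat(), end.isoformat())
```

Every timestamp inside the same window maps to the same `(start, end)` pair, which is what lets a whole block of time be handled as a unit.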
InfluxDB compared to SQL databases
What’s in a database? This page gives SQL users an overview of how InfluxDB is like an SQL database and how it’s not. It highlights some of the major distinctions between the two and provides a loose crosswalk between the different database terminologies and query languages. In general… InfluxDB is designed to work with time-series data. SQL databases can handle time-series data but weren’t created strictly for that purpose. In short, InfluxDB is made to store a large volume of time-series data and to perform real-time analysis on that data quickly.
InfluxDB design insights and tradeoffs
InfluxDB is a time series database. Optimizing for this use case entails some tradeoffs, primarily to increase performance at the cost of functionality. The following design insights lead to tradeoffs: For the time series use case, we assume that if the same data is sent multiple times, it is the exact same data that a client just sent several times. Pro: Simplified conflict resolution increases write performance.
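The duplicate-data assumption can be illustrated with a toy in-memory store in which a repeated write of the same point simply overwrites itself. This is a sketch of the idea only, not InfluxDB's storage engine; the `store` dictionary, the `write_point` helper, and the choice to key points by series and timestamp are assumptions made for this example.

```python
# Toy store: each point is keyed by (series, timestamp).
# Assumption for this sketch: a later write to the same key overwrites the
# earlier field values, so no conflict resolution logic is needed.
store = {}


def write_point(series, timestamp, fields):
    """Insert a point, or overwrite the fields of an existing identical key."""
    store.setdefault((series, timestamp), {}).update(fields)


write_point("cpu,host=a", 1000, {"usage": 0.5})
write_point("cpu,host=a", 1000, {"usage": 0.5})  # the same data, resent
write_point("cpu,host=a", 2000, {"usage": 0.7})  # a different timestamp

print(len(store))
```

Because a resent point lands on the same key, the write path never has to compare or reconcile versions, which is the performance benefit the tradeoff describes.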
InfluxDB glossary of terms
aggregation: An InfluxQL function that returns an aggregated value across a set of points. See InfluxQL Functions for a complete list of the available and upcoming aggregations. Related entries: function, selector, transformation
batch: A collection of points in line protocol format, separated by newlines (0x0A). A batch of points may be submitted to the database using a single HTTP request to the write endpoint. This makes writes via the HTTP API much more performant by drastically reducing the HTTP overhead.
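As a rough sketch of batching, the snippet below joins several line protocol points with newlines and prepares a single HTTP request for the write endpoint. The host, port, database name, and sample points are assumptions made for illustration, and the request is only constructed, not sent.

```python
from urllib.parse import quote
from urllib.request import Request


def build_batch(points):
    """Join line protocol points with newlines (0x0A) into one request body."""
    return "\n".join(points).encode("utf-8")


def build_write_request(base_url, database, points):
    """Prepare one POST request that carries the whole batch."""
    url = f"{base_url}/write?db={quote(database)}"
    return Request(url, data=build_batch(points), method="POST")


# Hypothetical sample points and server address.
points = [
    "cpu,host=a usage=0.5 1000000000",
    "cpu,host=b usage=0.7 1000000000",
]
request = build_write_request("http://localhost:8086", "mydb", points)
print(request.full_url)
print(request.data.decode("utf-8"))
```

Passing `request` to `urllib.request.urlopen` would send both points in one round trip instead of two.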
InfluxDB key concepts
Covers key concepts to learn about InfluxDB.
InfluxDB schema design and data layout
Covers general guidelines for InfluxDB schema design and data layout.
Time Series Index (TSI) details
Time Series Index (TSI) description: When InfluxDB ingests data, it stores not only the value but also indexes the measurement and tag information so that the data can be queried quickly. In earlier versions, index data could only be stored in memory, which requires a lot of RAM and places an upper bound on the number of series a machine can hold. This upper bound is usually somewhere between 1 and 4 million series, depending on the machine used.
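To make "number of series" concrete, the following sketch estimates series cardinality from a handful of line protocol points by counting unique combinations of measurement and tag set. The helper names are hypothetical, and the parsing is deliberately naive: it assumes no escaped spaces or commas in measurement or tag values.

```python
def series_key(line):
    """Return measurement plus sorted tag set, ignoring fields and timestamp."""
    head = line.split(" ", 1)[0]          # "measurement,tag1=v1,tag2=v2"
    measurement, *tags = head.split(",")
    return ",".join([measurement] + sorted(tags))


def series_cardinality(lines):
    """Count the unique series keys in a collection of points."""
    return len({series_key(line) for line in lines})


# Hypothetical sample points.
lines = [
    "cpu,host=a,region=us usage=0.5 1000",
    "cpu,region=us,host=a usage=0.6 2000",  # same series, tags reordered
    "cpu,host=b,region=us usage=0.7 1000",
    "mem,host=a used=42 1000",
]
print(series_cardinality(lines))
```

Each additional tag value multiplies the number of possible keys, which is why an index held entirely in memory puts a ceiling on how many series one machine can hold.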
Time Series Index (TSI) overview
Time Series Index (TSI): To support a large number of time series, that is, a very high cardinality in the number of unique time series that the database stores, InfluxData has added the new Time Series Index (TSI). InfluxData supports customers using InfluxDB with tens of millions of time series. InfluxData’s goal, however, is to expand to hundreds of millions, and eventually billions. Using InfluxData’s TSI storage engine, users should be able to have millions of unique time series.