InfluxDB concepts
This page documents an earlier version of InfluxDB. InfluxDB v2.7 is the latest stable version.
Understanding the following concepts will help you get the most out of InfluxDB.
Comparison to SQL
What’s in a database?
This page gives SQL users an overview of how InfluxDB is like an SQL database and how it’s not. It highlights some of the major distinctions between the two and provides a loose crosswalk between the different database terminologies and query languages.
In general…
InfluxDB is designed to work with time-series data. SQL databases can handle time-series data but weren’t created strictly for that purpose. In short, InfluxDB is made to store a large volume of time-series data and to perform real-time analysis on that data quickly.
Design insights and tradeoffs in InfluxDB
InfluxDB is a time-series database. Optimizing for this use case entails some tradeoffs, primarily to increase performance at the cost of functionality. Below is a list of some of the design insights that lead to tradeoffs:
For the time-series use case, we assume that if the same data is sent multiple times, it is the exact same data that a client just sent several times.
Pro: Simplified conflict resolution increases write performance.
Con: Cannot store duplicate data; may overwrite data in rare circumstances.
Deletes are a rare occurrence.
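The first insight above can be illustrated with a toy in-memory store. This is a minimal sketch, not InfluxDB's implementation: the `ToyStore` class is hypothetical, and it assumes a point is identified by its measurement, tag set, and timestamp, so a repeated write lands on the same key instead of creating a second copy.

```python
from typing import Dict, FrozenSet, Tuple

# Assumed identity of a point: (measurement, tag set, timestamp).
PointKey = Tuple[str, FrozenSet[Tuple[str, str]], int]


class ToyStore:
    """Hypothetical in-memory store that illustrates duplicate handling."""

    def __init__(self) -> None:
        self.points: Dict[PointKey, Dict[str, float]] = {}

    def write(self, measurement: str, tags: Dict[str, str],
              fields: Dict[str, float], timestamp: int) -> None:
        key = (measurement, frozenset(tags.items()), timestamp)
        # No conflict resolution: a repeated key is merged in place,
        # and a newer field value silently replaces the older one.
        self.points.setdefault(key, {}).update(fields)


store = ToyStore()
store.write("cpu", {"host": "a"}, {"usage": 0.5}, 1000)
store.write("cpu", {"host": "a"}, {"usage": 0.5}, 1000)  # resent: no second copy
store.write("cpu", {"host": "a"}, {"usage": 0.9}, 1000)  # same key: overwrites
```

After these three writes the store holds a single point whose `usage` is 0.9, which shows both the pro (no conflict check on write) and the con (the earlier value is gone).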
Glossary of terms
aggregation
An InfluxQL function that returns an aggregated value across a set of points. See InfluxQL Functions for a complete list of the available and upcoming aggregations.
Related entries: function, selector, transformation
batch
A collection of points in line protocol format, separated by newlines (0x0A). A batch of points may be submitted to the database using a single HTTP request to the write endpoint. This makes writes using the HTTP API significantly more performant by drastically reducing the HTTP overhead.
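As a rough illustration of the batch entry, the sketch below builds a newline-separated request body from several points. The helper names `format_field`, `to_line`, and `make_batch` are hypothetical, and the formatting is simplified: it assumes measurement names, tag keys, and tag values need no escaping. It only builds the body and does not send an HTTP request.

```python
from typing import Dict, Iterable, Tuple, Union

FieldValue = Union[bool, int, float, str]
Point = Tuple[str, Dict[str, str], Dict[str, FieldValue], int]


def format_field(value: FieldValue) -> str:
    """Render one field value in line protocol style."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"
    if isinstance(value, float):
        return repr(value)
    return '"' + str(value).replace('"', '\\"') + '"'


def to_line(measurement: str, tags: Dict[str, str],
            fields: Dict[str, FieldValue], timestamp: int) -> str:
    """Render one point as: measurement[,tag=value...] field=value[,...] timestamp."""
    tag_part = "".join(f",{k}={v}" for k, v in sorted(tags.items()))
    field_part = ",".join(f"{k}={format_field(v)}" for k, v in fields.items())
    return f"{measurement}{tag_part} {field_part} {timestamp}"


def make_batch(points: Iterable[Point]) -> str:
    """Join points with newlines (0x0A) so they fit in one request body."""
    return "\n".join(to_line(*p) for p in points)


batch = make_batch([
    ("cpu", {"host": "a"}, {"usage": 0.5}, 1),
    ("cpu", {"host": "b"}, {"usage": 0.7}, 2),
])
```

The resulting string would be sent as the body of a single write request, so the per-request HTTP overhead is paid once for the whole batch rather than once per point.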
Key concepts
Before diving into InfluxDB, it’s good to get acquainted with some of the key concepts of the database. This document provides a gentle introduction to those concepts and common InfluxDB terminology. We’ve provided a list below of all the terms we’ll cover, but we recommend reading this document from start to finish to gain a more general understanding of our favorite time series database.
database, field key, field set, field value, measurement, point, retention policy, series, tag key, tag set, tag value, timestamp
Check out the Glossary if you prefer the cold, hard facts.
Schema design and data layout
Every InfluxDB use case is special and your schema will reflect that uniqueness. There are, however, general guidelines to follow and pitfalls to avoid when designing your schema. The page covers general recommendations, encouraged schema design, discouraged schema design, and shard group duration management.
General recommendations
Encouraged schema design
In no particular order, we recommend that you:
Encode metadata in tags
Tags are indexed and fields are not indexed. This means that queries on tags are more performant than those on fields.
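Why an indexed tag is cheaper to query than an unindexed field can be shown with a toy model. This is only a sketch under stated assumptions: `POINTS`, `build_tag_index`, `find_by_tag`, and `find_by_field` are hypothetical names, and the dictionary-based index stands in for whatever index structure InfluxDB actually uses.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical sample data: metadata in tags, measured values in fields.
POINTS = [
    {"tags": {"host": "a"}, "fields": {"usage": 0.5}},
    {"tags": {"host": "b"}, "fields": {"usage": 0.7}},
    {"tags": {"host": "a"}, "fields": {"usage": 0.9}},
]


def build_tag_index(points: List[dict]) -> Dict[Tuple[str, str], Set[int]]:
    """Map each (tag key, tag value) pair to the positions of matching points."""
    index: Dict[Tuple[str, str], Set[int]] = defaultdict(set)
    for position, point in enumerate(points):
        for pair in point["tags"].items():
            index[pair].add(position)
    return index


def find_by_tag(index: Dict[Tuple[str, str], Set[int]],
                key: str, value: str) -> List[int]:
    """Tag query: a single dictionary lookup, independent of the point count."""
    return sorted(index.get((key, value), ()))


def find_by_field(points: List[dict], key: str,
                  predicate: Callable[[float], bool]) -> List[int]:
    """Field query: no index, so every point has to be inspected."""
    return [i for i, p in enumerate(points)
            if key in p["fields"] and predicate(p["fields"][key])]


index = build_tag_index(POINTS)
hosts_a = find_by_tag(index, "host", "a")
busy = find_by_field(POINTS, "usage", lambda v: v > 0.6)
```

The tag query touches only the index entry for `host=a`, while the field query walks the full list, which is the cost difference the recommendation is pointing at.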
Storage engine and the Time-Structured Merge Tree (TSM)
The new InfluxDB storage engine looks very similar to an LSM Tree. It has a write-ahead log and a collection of read-only data files, which are similar in concept to SSTables in an LSM Tree. TSM files contain sorted, compressed series data. InfluxDB will create a shard for each block of time. For example, if you have a retention policy with an unlimited duration, shards will be created for each 7-day block of time.
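The idea of one shard per block of time can be sketched by mapping a timestamp to the 7-day window that contains it. The `shard_window` function is hypothetical, and aligning windows to the Unix epoch is an assumption made purely for illustration; the boundaries InfluxDB actually picks may differ.

```python
from datetime import datetime, timedelta, timezone
from typing import Tuple

# Assumed alignment origin; InfluxDB's real window boundaries may differ.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def shard_window(ts: datetime,
                 duration: timedelta = timedelta(days=7)) -> Tuple[datetime, datetime]:
    """Return the [start, end) block of time that would hold a point at ts."""
    blocks_elapsed = (ts - EPOCH) // duration  # whole blocks since the origin
    start = EPOCH + blocks_elapsed * duration
    return start, start + duration


first = shard_window(EPOCH + timedelta(days=1))
same = shard_window(EPOCH + timedelta(days=6, hours=23))
later = shard_window(EPOCH + timedelta(days=7))
```

Here `first` and `same` are the same window, so both points would share a shard, while `later` falls into the next 7-day block.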