InfluxDB shards and shard groups

This page documents an earlier version of InfluxDB OSS. InfluxDB 3 Core is the latest stable version.

InfluxDB organizes time series data into shards when storing data to disk. Shards are grouped into shard groups. Learn the relationships between buckets, shards, and shard groups.

- Shards
- Shard groups
  - Shard group duration
  - Shard group diagram
- Shard life cycle
- Shard deletion

Shards

A shard contains encoded and compressed time series data for a given time range defined by the shard group duration. All points in a series within the specified shard group duration are stored in the same shard. A single shard contains multiple series and one or more TSM files on disk, and belongs to a shard group.

Shard groups

A shard group belongs to an InfluxDB bucket and contains time series data for a specific time range defined by the shard group duration.

In InfluxDB OSS, a shard group typically contains only a single shard. In an InfluxDB Enterprise 1.x cluster, shard groups contain multiple shards distributed across multiple data nodes.

Shard group duration

The shard group duration specifies the time range for each shard group and determines how often to create a new shard group. By default, InfluxDB sets the shard group duration according to the retention period of the bucket:

Bucket retention period	Default shard group duration
less than 2 days	1h
between 2 days and 6 months	1d
greater than 6 months	7d
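
The table above can be read as a small lookup function. The following is a minimal sketch, not InfluxDB's implementation: the function name is hypothetical, "6 months" is approximated as 180 days, and an infinite retention period (passed as `None`) is assumed to fall in the last row.

```python
from datetime import timedelta
from typing import Optional

# Assumption: "6 months" is approximated as 180 days; the page does not define it.
SIX_MONTHS = timedelta(days=180)


def default_shard_group_duration(retention: Optional[timedelta]) -> timedelta:
    """Return the default shard group duration for a bucket retention period.

    `retention=None` stands for an infinite retention period, which this
    sketch (an assumption) places in the "greater than 6 months" row.
    """
    if retention is None or retention > SIX_MONTHS:
        return timedelta(days=7)
    if retention < timedelta(days=2):
        return timedelta(hours=1)
    # Between 2 days and 6 months, inclusive in this sketch.
    return timedelta(days=1)
```

For example, `default_shard_group_duration(timedelta(days=30))` returns a one-day duration.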

Shard group duration configuration options

To configure a custom bucket shard group duration, use the --shard-group-duration flag with the influx bucket create and influx bucket update commands.

Shard group durations must be shorter than the bucket’s retention period.

To view your bucket’s shard group duration, use the influx bucket list command.
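
As an illustration of the rule that the shard group duration must be shorter than the retention period, the sketch below checks the two values before building an `influx bucket create` argument list. The helper names are hypothetical; the `--shard-group-duration` flag is the one named above, while `--name` and `--retention` are the CLI's bucket name and retention flags, which this page does not describe.

```python
from datetime import timedelta


def _fmt_hours(d: timedelta) -> str:
    """Format a duration as whole hours, for example 24h (hypothetical helper)."""
    hours, remainder = divmod(int(d.total_seconds()), 3600)
    if remainder or hours <= 0:
        raise ValueError("this sketch only handles positive whole-hour durations")
    return f"{hours}h"


def bucket_create_args(name: str, retention: timedelta,
                       shard_group_duration: timedelta) -> list:
    """Build an `influx bucket create` argument list after validating durations."""
    # Rule from the text: the shard group duration must be shorter than
    # the bucket's retention period.
    if shard_group_duration >= retention:
        raise ValueError(
            "shard group duration must be shorter than the retention period")
    return [
        "influx", "bucket", "create",
        "--name", name,
        "--retention", _fmt_hours(retention),
        "--shard-group-duration", _fmt_hours(shard_group_duration),
    ]
```

The resulting list could be passed to `subprocess.run` on a machine where the `influx` CLI is installed and configured.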

Shard group diagram

The following diagram represents a bucket with a 4d retention period and a 1d shard group duration:

[Diagram: four shard groups, each containing a single shard]
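
To make the relationship between a point's timestamp and its shard group concrete, the sketch below computes the time range of the shard group that would cover a given timestamp. It assumes ranges are aligned to whole multiples of the shard group duration counted from the Unix epoch in UTC; the function name is hypothetical and the alignment rule is an assumption, not something this page states.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def shard_group_range(ts: datetime, duration: timedelta):
    """Return the (start, end) range of the shard group covering `ts`.

    Assumption: ranges are aligned to multiples of `duration` counted from
    the Unix epoch in UTC. `ts` must be timezone-aware. The end is exclusive.
    """
    index = (ts - EPOCH) // duration  # timedelta // timedelta yields an int
    start = EPOCH + index * duration
    return start, start + duration
```

With a 1d shard group duration, every timestamp on 2024-01-03 (UTC) maps to the range from 2024-01-03 00:00 to 2024-01-04 00:00, so all points written for that day land in the same shard group.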

Shard life cycle

Shard precreation

The InfluxDB shard precreation service pre-creates shards with future start and end times for each shard group based on the shard group duration.

The precreator service does not pre-create shards for past time ranges. When backfilling historical data, InfluxDB creates shards for past time ranges as needed, resulting in temporarily lower write throughput.

Shard writes

InfluxDB writes time series data to un-compacted or “hot” shards. When a shard is no longer actively written to, InfluxDB compacts shard data, resulting in a “cold” shard.

Typically, InfluxDB writes data to the most recent shard group, but when backfilling historical data, InfluxDB writes to older shards that must first be un-compacted. When the backfill is complete, InfluxDB re-compacts the older shards.

Shard compaction

InfluxDB compacts shards at regular intervals to compress time series data and optimize disk usage. When compactions are enabled, InfluxDB checks every second whether shard compactions are needed. If there haven’t been writes during the compact-full-write-cold-duration period (by default, 4h), InfluxDB compacts all TSM files. Otherwise, InfluxDB groups TSM files into compaction levels (determined by the number of times the file has been compacted), and attempts to combine files and compress them more efficiently.
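
The cold-duration check described above can be sketched as a single comparison. This is an illustration only: the function name is hypothetical, and whether the real check uses "at least" or "more than" the cold duration is an assumption.

```python
from datetime import datetime, timedelta

# Default from the text above: compact-full-write-cold-duration is 4h.
COLD_DURATION = timedelta(hours=4)


def is_full_compaction_due(last_write: datetime, now: datetime,
                           cold_duration: timedelta = COLD_DURATION) -> bool:
    """Return True when no write has arrived for at least `cold_duration`.

    Assumption: the boundary is inclusive (>=).
    """
    return now - last_write >= cold_duration
```

A shard last written five hours ago would qualify for a full compaction under the default; one written an hour ago would instead go through level-based compaction.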

InfluxDB uses the following compaction levels, starting from the level 0 (L0) log file and followed by four compacted levels (L1 through L4):

Level 0 (L0): The log file (LogFile). Once this file exceeds a 5 MB threshold, InfluxDB creates a new active log file, and the previous one begins compacting into an IndexFile. This first index file is at level 1 (L1).
Level 1 (L1): InfluxDB flushes all newly written data held in an in-memory cache to disk into an IndexFile.
Level 2 (L2): InfluxDB compacts up to eight L1-compacted files into one or more L2 files by combining multiple blocks containing the same series into fewer blocks in one or more new files.
Level 3 (L3): InfluxDB iterates over L2-compacted file blocks (over a certain size) and combines multiple blocks containing the same series into one block in a new file.
Level 4 (L4): Full compaction. InfluxDB iterates over L3-compacted file blocks and combines multiple blocks containing the same series into one block in a new file.

InfluxDB schedules compactions preferentially, using the following guidelines:

The lower the level (fewer times the file has been compacted), the more weight is given to compacting the file.
The more compactible files in a level, the higher the priority given to compacting that level. If the number of files in each level is equal, lower levels are compacted first.
If a higher level has more candidates for compaction, it may be compacted before a lower level. InfluxDB multiplies the number of collection groups (collections of files to compact into a single next-generation file) by a specified weight (0.4, 0.3, 0.2, and 0.1) per level, to determine the compaction priority.
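
The weighting described above can be sketched as follows. This is an illustration under stated assumptions, not InfluxDB's scheduler: the function name is hypothetical, and the weights 0.4, 0.3, 0.2, and 0.1 are assumed to apply to levels 1 through 4 in that order. The weights are stored as integer tenths so that ties compare exactly.

```python
# Weights from the text above (0.4, 0.3, 0.2, 0.1), kept as integer tenths
# so that equal scores compare exactly instead of depending on float rounding.
# Assumption: the weights map to levels 1 through 4 in that order.
LEVEL_WEIGHTS_TENTHS = {1: 4, 2: 3, 3: 2, 4: 1}


def next_level_to_compact(groups_per_level: dict):
    """Pick the level with the highest (group count x weight) score.

    `groups_per_level` maps a level (1-4) to its number of compaction groups.
    Ties go to the lower level. Returns None when no groups are waiting.
    """
    best_level, best_score = None, 0
    for level in sorted(LEVEL_WEIGHTS_TENTHS):
        score = groups_per_level.get(level, 0) * LEVEL_WEIGHTS_TENTHS[level]
        if score > best_score:  # strict comparison keeps the lower level on ties
            best_level, best_score = level, score
    return best_level
```

With one group at level 1 and three groups at level 3, level 3 scores higher (6 against 4 in tenths) and is picked first, which matches the guideline that a higher level with more candidates may be compacted before a lower one.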

Compaction-related configuration settings are especially beneficial for systems with irregular loads, because they limit compactions during periods of high usage and let compactions catch up during periods of lower load.

In systems with stable loads, compactions that interfere with other operations typically indicate that the system is undersized for its load, and configuration changes won’t help much.

Shard deletion

The InfluxDB retention enforcement service routinely checks for shard groups older than their bucket’s retention period. Once the start time of a shard group is beyond the bucket’s retention period, InfluxDB deletes the shard group and associated shards and TSM files.

In buckets with an infinite retention period, shards remain on disk indefinitely.

InfluxDB only deletes cold shards

InfluxDB only deletes cold shards. If you backfill data older than a bucket’s retention period, the backfilled data remains on disk until both of the following occur:

The shard returns to a cold state.
The retention enforcement service deletes the shard group.
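
The deletion conditions in this section can be combined into one predicate. The sketch below follows the wording on this page, which compares the shard group's start time against the retention period; the function name is hypothetical, `None` stands for an infinite retention period, and the actual retention enforcement service may differ in detail.

```python
from datetime import datetime, timedelta
from typing import Optional


def is_deletable(group_start: datetime, now: datetime,
                 retention: Optional[timedelta], is_cold: bool) -> bool:
    """Apply the deletion rule as this page states it.

    `retention=None` stands for an infinite retention period.
    """
    if retention is None:
        return False  # infinite retention: shards remain on disk indefinitely
    # Per the text: the group's start time is beyond the retention period.
    expired = now - group_start > retention
    # Only cold shards are deleted, so a hot (recently backfilled) shard waits.
    return expired and is_cold
```

A shard group that is past retention but still hot because of a recent backfill returns `False` here until it goes cold again.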

Retention enforcement-related configuration settings

storage-retention-check-interval
