InfluxDB Clustered data durability

How data flows through InfluxDB Clustered

When data is written to InfluxDB Clustered, it progresses through multiple stages to ensure durability, optimized performance and storage, and efficient querying. Configuration options at each stage affect system behavior, balancing reliability and resource usage.

Figure: Write request, response, and ingest flow for InfluxDB Clustered

Data ingest
Data storage
Data deletion
Backups

Write validation and memory buffer

The Router validates incoming data to prevent malformed or unsupported data from entering the system. InfluxDB Clustered then writes accepted data to multiple write-ahead log (WAL) files on the Ingester pods’ local storage (two by default, for redundancy) before acknowledging the write request. The Ingester also holds the data in memory so that leading-edge data is available for querying.
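The ordering described above (validate, write to the WALs, then acknowledge) can be illustrated with a small in-memory model. This is a conceptual sketch only, not InfluxDB code: the `SketchIngester` class, the `validate` and `write` functions, and the simplified "measurement field=value timestamp" check are all hypothetical names and assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class SketchIngester:
    """Hypothetical in-memory stand-in for an Ingester pod."""
    wal: list = field(default_factory=list)     # stands in for a WAL file on local storage
    buffer: list = field(default_factory=list)  # in-memory data available to queries


def validate(line: str) -> bool:
    """Toy check (assumption): 'measurement field=value timestamp' with an integer timestamp."""
    parts = line.split(" ")
    return len(parts) == 3 and "=" in parts[1] and parts[2].isdigit()


def write(line: str, ingesters: list, replicas: int = 2) -> bool:
    """Return True (acknowledge) only after `replicas` WAL appends have been made."""
    if not validate(line):
        return False  # rejected during validation; nothing reaches a WAL
    targets = ingesters[:replicas]
    if len(targets) < replicas:
        return False  # not enough Ingesters to meet the replication target
    for ingester in targets:
        ingester.wal.append(line)     # durable copy first
        ingester.buffer.append(line)  # then make it queryable from memory
    return True
```

In this model, a write is acknowledged only after every targeted WAL holds the data, so an acknowledged write always exists in at least `replicas` places.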

Write-ahead log (WAL) persistence

Ingesters persist the contents of the WAL to Parquet files in object storage and update the Catalog to reference the newly created Parquet files. InfluxDB Clustered retains WALs until the data is persisted.

If an Ingester node is gracefully shut down (for example, during a new software deployment), it flushes the contents of the WAL to the Parquet files before shutting down.
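The persistence order matters: the WAL is released only after the Parquet file exists and the Catalog references it. The following is a minimal sketch of that ordering, assuming a dictionary in place of object storage and a list in place of the Catalog. The `PersistingIngester` class and its method names are hypothetical and do not reflect InfluxDB internals.

```python
class PersistingIngester:
    """Hypothetical model of WAL persistence; not an InfluxDB API."""

    def __init__(self):
        self.wal = []           # WAL entries not yet persisted
        self.object_store = {}  # path -> immutable tuple of rows (stands in for Parquet files)
        self.catalog = []       # paths of persisted files
        self._next_id = 0

    def append(self, row):
        self.wal.append(row)

    def persist(self):
        """Write WAL contents to a new file, register it, then release the WAL."""
        if not self.wal:
            return None
        path = f"file-{self._next_id}.parquet"
        self._next_id += 1
        self.object_store[path] = tuple(self.wal)  # 1. write the file
        self.catalog.append(path)                  # 2. reference it in the catalog
        self.wal.clear()                           # 3. only now drop the WAL entries
        return path

    def shutdown(self):
        """Graceful shutdown: flush whatever is left in the WAL."""
        return self.persist()
```

If the process stopped between steps 1 and 3 in this model, the WAL would still hold the data, which is the property the retention rule above is meant to provide.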

Data storage

In InfluxDB Clustered, all measurements are stored in Apache Parquet files that represent a point-in-time snapshot of the data. The Parquet files are immutable and are never replaced or modified. Parquet files are stored in object storage and referenced in the Catalog, which InfluxDB uses to find the appropriate Parquet files for a particular set of data.
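As a rough illustration of how a catalog can map a query to a set of immutable files, the sketch below models each file as a frozen record and selects files by table and time range. The `ParquetFile` fields and the `files_for_query` function are assumptions made for this example; the actual Catalog schema is not described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: a file record cannot be modified after creation
class ParquetFile:
    path: str
    table: str
    min_time: int  # assumed metadata: earliest timestamp in the file
    max_time: int  # assumed metadata: latest timestamp in the file


def files_for_query(catalog, table, start, stop):
    """Return paths of files for `table` whose time range overlaps [start, stop]."""
    return [
        f.path
        for f in catalog
        if f.table == table and f.min_time <= stop and f.max_time >= start
    ]
```

Because the records are immutable, new data in this model always arrives as an additional file and an additional catalog entry, never as a change to an existing one.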

Data deletion

When data is deleted or expires (reaches the database’s retention period), InfluxDB performs the following steps:

Marks the associated Parquet files as deleted in the Catalog.
Filters out data marked for deletion from all queries.
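The two steps above amount to a soft delete: files are flagged in the catalog and then excluded at query time. A minimal sketch of that pattern follows, assuming a dictionary-based catalog with hypothetical `max_time` and `deleted` fields. The function names are illustrative, not InfluxDB APIs.

```python
def mark_expired(catalog, now, retention_period):
    """Flag files whose newest data is older than the retention period."""
    cutoff = now - retention_period
    for entry in catalog.values():
        if entry["max_time"] < cutoff:
            entry["deleted"] = True  # soft delete: only the catalog entry is flagged


def queryable_files(catalog):
    """Return only the paths a query is allowed to read."""
    return sorted(path for path, entry in catalog.items() if not entry["deleted"])
```

For example, with `now=100` and `retention_period=50`, a file whose `max_time` is 10 would be flagged and no longer returned by `queryable_files`, while a file whose `max_time` is 90 would remain visible.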

Backups

InfluxDB Clustered implements the following data backup strategies:

Backup of WAL file: The WAL file is written to locally attached storage. If an Ingester process fails, the replacement Ingester reads the WAL file on startup and continues normal operation. WAL files are retained until their contents have been written to Parquet files in object storage. For added protection, Ingesters can be configured for write replication, where each measurement is written to two different WAL files before the write is acknowledged.
Backup of Parquet files: Parquet files are stored in object storage.
Backup of Catalog: InfluxData keeps a transaction log of all recent updates to the InfluxDB Catalog and generates a daily backup of the Catalog.
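The WAL recovery behavior described above (a restarted Ingester reads the WAL on startup) can be sketched with an append-only file of JSON lines. This is a generic illustration of the write-ahead log technique, assuming a hypothetical file layout and function names. It does not reflect the on-disk WAL format used by InfluxDB Clustered.

```python
import json
import os
import tempfile


def append_to_wal(path, entry):
    """Append one entry as a JSON line and force it to disk before returning."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(entry) + "\n")
        f.flush()
        os.fsync(f.fileno())


def replay_wal(path):
    """Rebuild the in-memory buffer from the WAL, as a restarted process would."""
    if not os.path.exists(path):
        return []
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


# Hypothetical usage: two writes, then a simulated restart.
wal_path = os.path.join(tempfile.mkdtemp(), "ingester.wal")
append_to_wal(wal_path, {"table": "cpu", "usage": 0.5, "time": 1})
append_to_wal(wal_path, {"table": "cpu", "usage": 0.7, "time": 2})
# The in-memory buffer is lost on failure, but the file survives.
recovered = replay_wal(wal_path)
```

After the simulated restart, `recovered` holds both entries in the order they were written, so the process can continue from the same state.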

