InfluxDB Cloud data durability

InfluxDB Cloud replicates all data in the storage tier across two availability zones in a cloud region, automatically creates backups, and verifies that replicated data is consistent and readable.

Data replication

InfluxDB Cloud replicates data in both the write tier and the storage tier.

  • Write tier: all data written to InfluxDB is processed by a durable message queue. The message queue partitions each batch of points by series key and then replicates each partition across other physical nodes in the message queue.
  • Storage tier: all data in the underlying storage tier is replicated across two availability zones in a cloud region.
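As a rough illustration of partitioning a batch by series key, the sketch below hashes each point's series key to pick a partition. It is not InfluxDB Cloud's implementation: the partition count, the hash function, and the helper names (`series_key`, `partition_for`, `partition_batch`) are hypothetical, and the series key is assumed to be the measurement and tag set, that is, everything before the first unescaped space in a line of line protocol.

```python
import hashlib
import re
from collections import defaultdict

NUM_PARTITIONS = 4  # hypothetical value, chosen only for illustration


def series_key(line):
    """Return the measurement and tag set of a line protocol point.

    Assumption: the series key is everything before the first space
    that is not escaped with a backslash.
    """
    return re.split(r"(?<!\\) ", line, maxsplit=1)[0]


def partition_for(key, num_partitions=NUM_PARTITIONS):
    """Map a series key to a partition number with a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def partition_batch(lines, num_partitions=NUM_PARTITIONS):
    """Group a batch of points so each series always lands in one partition."""
    partitions = defaultdict(list)
    for line in lines:
        partitions[partition_for(series_key(line), num_partitions)].append(line)
    return dict(partitions)
```

Because the partition depends only on the series key, every point in a given series goes to the same partition, and each partition can then be replicated as a unit.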

Backup processes

InfluxDB Cloud backs up all data in the following ways:

Backup on write

All inbound write requests to InfluxDB Cloud are added to a durable message queue. The message queue does the following:

  1. Caches the line protocol of each write request.
  2. Writes data to the storage tier.
  3. Routinely persists cached line protocol to object storage as an out-of-band backup.
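The three steps above can be pictured with a small in-memory model. This is a minimal sketch under stated assumptions, not the service's code: `MessageQueueSketch`, the list used as the storage tier, the dictionary used as object storage, the backup key format, and clearing the cache after each backup are all hypothetical.

```python
class MessageQueueSketch:
    """Toy model of the backup-on-write flow (all names are hypothetical)."""

    def __init__(self, storage, object_store):
        self.cache = []                   # cached line protocol awaiting backup
        self.storage = storage            # stand-in for the storage tier (a list)
        self.object_store = object_store  # stand-in for object storage (a dict)
        self.backup_count = 0

    def handle_write(self, line_protocol):
        self.cache.append(line_protocol)    # step 1: cache the line protocol
        self.storage.append(line_protocol)  # step 2: write to the storage tier

    def persist_backup(self):
        """Step 3: persist cached line protocol as an out-of-band backup."""
        if not self.cache:
            return None
        key = f"backup-{self.backup_count:06d}.lp"
        self.object_store[key] = "\n".join(self.cache)
        self.cache.clear()  # assumption: cached lines are dropped once backed up
        self.backup_count += 1
        return key
```

In this model the backup holds the raw line protocol exactly as it was received, independent of how the storage tier later organizes the data.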

Message queue backups provide raw line protocol that can be used to recover from a catastrophic failure in the storage tier or from an accidental deletion. The message queue retains data durably for 96 hours, so InfluxDB Cloud can sustain a failure of its underlying storage tier or object storage services for up to 96 hours without any data loss.
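To make the 96-hour window concrete, the following sketch checks whether a downstream outage fits inside it. The function name `covered_by_queue` and the choice to treat exactly 96 hours as covered are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone

QUEUE_DURABILITY = timedelta(hours=96)  # window stated in the text above


def covered_by_queue(outage_start, outage_end):
    """Return True if an outage is no longer than the 96-hour window."""
    return outage_end - outage_start <= QUEUE_DURABILITY


start = datetime(2024, 1, 1, tzinfo=timezone.utc)
short_outage = covered_by_queue(start, start + timedelta(hours=30))
long_outage = covered_by_queue(start, start + timedelta(hours=120))
```

Here `short_outage` is `True` and `long_outage` is `False`: a 30-hour outage fits inside the window, while a 120-hour outage exceeds it.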

To minimize potential data loss due to defects introduced in the InfluxDB Cloud service, we minimize the code used between the data ingest and backup processes.

Backup after compaction

The InfluxDB storage engine compresses data over time in a process known as compaction. When each compaction cycle completes, InfluxDB Cloud stores compressed TSM files in object storage.

Periodic TSM snapshots

To provide multiple data recovery points, InfluxDB Cloud takes weekly snapshots of TSM files uploaded to object storage. Each TSM snapshot includes a copy of all non-deleted data that exists when the snapshot is created. These snapshots are preserved for 100 days.
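A weekly cadence with 100-day retention implies roughly 14 or 15 recovery points at any time. The sketch below works through that arithmetic, assuming snapshots are taken exactly 7 days apart and expire once they are 100 days old; the function name and the exact expiry boundary are assumptions, not documented behavior.

```python
from datetime import date, timedelta


def retained_snapshots(first, today, interval_days=7, retention_days=100):
    """List snapshot dates that are still within the retention period."""
    snapshots = []
    current = first
    while current <= today:
        # Keep a snapshot only while it is younger than the retention period.
        if (today - current).days < retention_days:
            snapshots.append(current)
        current += timedelta(days=interval_days)
    return snapshots


first = date(2024, 1, 1)
on_snapshot_day = retained_snapshots(first, first + timedelta(days=364))
two_days_later = retained_snapshots(first, first + timedelta(days=366))
```

On a day when a snapshot has just been taken, 15 snapshots are available (ages 0 through 98 days). Two days later the oldest one reaches 100 days and expires, leaving 14.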

Recovery

InfluxDB Cloud uses the following out-of-band backups stored in object storage to recover data:

  • Message queue backup: line protocol from inbound write requests within the last 96 hours
  • Compaction backup: TSM files
  • TSM snapshots: weekly snapshots of TSM files in object storage

The Recovery Point Objective (RPO) is any accepted write. The Recovery Time Objective (RTO) is harder to predict because failure modes vary. While most common failure modes can be resolved within minutes or hours, critical failure modes may take longer. For example, if we need to rebuild all data from the TSM snapshots and message queue backup, it could take 24 hours or longer.

Data verification

InfluxDB Cloud has two data verification services running at all times:

  • Entropy detection: ensures that replicated data is consistent
  • Data verification: verifies that data written to InfluxDB is readable
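One common way to check that replicas agree is to compare a digest of each replica's contents. The sketch below illustrates that general idea only; it is not a description of how InfluxDB Cloud's entropy detection works, and the function names and the in-memory replica representation are hypothetical.

```python
import hashlib


def replica_digest(points):
    """Hash a replica's points in sorted order so ordering does not matter."""
    h = hashlib.sha256()
    for point in sorted(points):
        h.update(point.encode("utf-8"))
        h.update(b"\n")
    return h.hexdigest()


def replicas_consistent(replicas):
    """Return True if every replica holds the same set of points.

    `replicas` maps a replica name (for example, an availability zone)
    to the list of line protocol points it holds.
    """
    digests = {replica_digest(points) for points in replicas.values()}
    return len(digests) <= 1
```

Comparing digests rather than full contents keeps the check cheap: only when digests differ does a detailed comparison of the replicas become necessary.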

InfluxDB Cloud status

InfluxDB Cloud regions and underlying services are monitored at all times. For information about the current status of InfluxDB Cloud, see the InfluxDB Cloud status page.


InfluxDB Cloud powered by TSM