Handle duplicate data points

This page documents an earlier version of InfluxDB OSS. InfluxDB 3 Core is the latest stable version.

InfluxDB identifies unique data points by their measurement, tag set, and timestamp, each of which is part of the line protocol used to write data to InfluxDB.

web,host=host2,region=us_west firstByte=15.0 1559260800000000000
--- -------------------------                -------------------
 |               |                                    |
Measurement   Tag set                             Timestamp
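
As a rough illustration of this identity rule, the following Python sketch builds a comparison key from a point's measurement, tag set, and timestamp. The point_key helper is hypothetical and not part of any InfluxDB client library, and the sketch assumes that the order in which tags are written does not affect identity.

```python
def point_key(measurement, tags, timestamp_ns):
    """Return a hashable identity key for a point (illustrative only).

    Field keys and values are deliberately left out: they play no part
    in deciding whether two points are duplicates.
    """
    # Assumption: tag order is not significant, so sort tags by key.
    return (measurement, tuple(sorted(tags.items())), timestamp_ns)


a = point_key("web", {"host": "host2", "region": "us_west"}, 1559260800000000000)
b = point_key("web", {"region": "us_west", "host": "host2"}, 1559260800000000000)
c = point_key("web", {"host": "host2", "region": "us_west"}, 1559260800000000001)

print(a == b)  # same measurement, tag set, and timestamp
print(a == c)  # timestamp differs by one nanosecond
```

Two points whose keys compare equal would be treated as the same point; anything that changes the key, such as a different timestamp, makes them distinct.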

Duplicate data points

For points that have the same measurement name, tag set, and timestamp, InfluxDB creates a union of the old and new field sets. For any matching field keys, InfluxDB uses the field value of the new point. For example:

# Existing data point
web,host=host2,region=us_west firstByte=24.0,dnsLookup=7.0 1559260800000000000

# New data point
web,host=host2,region=us_west firstByte=15.0 1559260800000000000

After you submit the new data point, InfluxDB overwrites firstByte with the new field value and leaves dnsLookup unchanged:

# Resulting data point
web,host=host2,region=us_west firstByte=15.0,dnsLookup=7.0 1559260800000000000

from(bucket: "example-bucket")
  |> range(start: 2019-05-31T00:00:00Z, stop: 2019-05-31T12:00:00Z)
  |> filter(fn: (r) => r._measurement == "web")

Table: keys: [_measurement, host, region]
               _time  _measurement   host   region  dnsLookup  firstByte
--------------------  ------------  -----  -------  ---------  ---------
2019-05-31T00:00:00Z           web  host2  us_west          7         15
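
The merge behavior described above can be modeled as a dictionary union in which the new point's values win. This is only a sketch of the semantics, not how InfluxDB implements it internally; merge_fields is a hypothetical helper name.

```python
def merge_fields(existing, new):
    """Union two field sets; values from `new` win on matching field keys."""
    merged = dict(existing)  # copy so the caller's dict is not mutated
    merged.update(new)       # overwrite matching keys, add any new ones
    return merged


existing = {"firstByte": 24.0, "dnsLookup": 7.0}
new = {"firstByte": 15.0}

result = merge_fields(existing, new)
print(result)  # {'firstByte': 15.0, 'dnsLookup': 7.0}
```

The result matches the example: firstByte takes the new value and dnsLookup keeps the old one.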

Preserve duplicate points

To preserve both old and new field values in duplicate points, use one of the following strategies:

Add an arbitrary tag
Increment the timestamp

Add an arbitrary tag

Add an arbitrary tag with unique values so InfluxDB reads the duplicate points as unique.

For example, add a uniq tag to each data point:

# Existing point
web,host=host2,region=us_west,uniq=1 firstByte=24.0,dnsLookup=7.0 1559260800000000000

# New point
web,host=host2,region=us_west,uniq=2 firstByte=15.0 1559260800000000000

It is not necessary to retroactively add the unique tag to the existing data point. Tag sets are evaluated as a whole. The arbitrary uniq tag on the new point allows InfluxDB to recognize it as a unique point. However, this causes the schema of the two points to differ and may lead to challenges when querying the data.

After writing the new point to InfluxDB:

from(bucket: "example-bucket")
  |> range(start: 2019-05-31T00:00:00Z, stop: 2019-05-31T12:00:00Z)
  |> filter(fn: (r) => r._measurement == "web")

Table: keys: [_measurement, host, region, uniq]
               _time  _measurement   host   region  uniq  firstByte  dnsLookup
--------------------  ------------  -----  -------  ----  ---------  ---------
2019-05-31T00:00:00Z           web  host2  us_west     1         24          7

Table: keys: [_measurement, host, region, uniq]
               _time  _measurement   host   region  uniq  firstByte
--------------------  ------------  -----  -------  ----  ---------
2019-05-31T00:00:00Z           web  host2  us_west     2         15
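
One way to apply this strategy on the client side is to number points that would otherwise collide before writing them. The sketch below is a minimal illustration under stated assumptions: format_line and add_uniq_tags are hypothetical helpers, the formatter does no line protocol escaping and handles only float fields, and the counter only sees duplicates within a single batch.

```python
from collections import defaultdict


def format_line(measurement, tags, fields, timestamp_ns):
    """Format a point as line protocol (no escaping; float fields only)."""
    tag_str = "".join(f",{k}={v}" for k, v in tags.items())
    field_str = ",".join(f"{k}={v}" for k, v in fields.items())
    return f"{measurement}{tag_str} {field_str} {timestamp_ns}"


def add_uniq_tags(points):
    """Add a `uniq` tag that counts points sharing the same identity."""
    seen = defaultdict(int)
    lines = []
    for measurement, tags, fields, ts in points:
        # Identity before tagging: measurement, tag set, and timestamp.
        key = (measurement, tuple(sorted(tags.items())), ts)
        seen[key] += 1
        tagged = {**tags, "uniq": str(seen[key])}
        lines.append(format_line(measurement, tagged, fields, ts))
    return lines


tags = {"host": "host2", "region": "us_west"}
points = [
    ("web", tags, {"firstByte": 24.0, "dnsLookup": 7.0}, 1559260800000000000),
    ("web", tags, {"firstByte": 15.0}, 1559260800000000000),
]
lines = add_uniq_tags(points)
for line in lines:
    print(line)
```

Because each distinct uniq value creates a separate series, a counter like this is best kept to a small range of values.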

Increment the timestamp

Increment the timestamp by a nanosecond to enforce the uniqueness of each point.

# Existing data point
web,host=host2,region=us_west firstByte=24.0,dnsLookup=7.0 1559260800000000000

# New data point
web,host=host2,region=us_west firstByte=15.0 1559260800000000001

After writing the new point to InfluxDB:

from(bucket: "example-bucket")
  |> range(start: 2019-05-31T00:00:00Z, stop: 2019-05-31T12:00:00Z)
  |> filter(fn: (r) => r._measurement == "web")

Table: keys: [_measurement, host, region]
                         _time  _measurement   host   region  firstByte  dnsLookup
------------------------------  ------------  -----  -------  ---------  ---------
2019-05-31T00:00:00.000000000Z           web  host2  us_west         24          7
2019-05-31T00:00:00.000000001Z           web  host2  us_west         15
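
The same idea can be sketched in client code: before writing a batch, bump the timestamp of any point that would collide with an earlier one in the same series. bump_duplicate_timestamps is a hypothetical helper, and the sketch assumes timestamps are integers in nanoseconds and that only collisions within the batch matter.

```python
def bump_duplicate_timestamps(points):
    """Shift colliding timestamps forward by 1 ns until each is unique."""
    used = set()
    result = []
    for measurement, tags, fields, ts in points:
        series = (measurement, tuple(sorted(tags.items())))
        # Keep incrementing until this (series, timestamp) pair is free.
        while (series, ts) in used:
            ts += 1
        used.add((series, ts))
        result.append((measurement, tags, fields, ts))
    return result


tags = {"host": "host2", "region": "us_west"}
points = [
    ("web", tags, {"firstByte": 24.0, "dnsLookup": 7.0}, 1559260800000000000),
    ("web", tags, {"firstByte": 15.0}, 1559260800000000000),
]
bumped = bump_duplicate_timestamps(points)
print([p[3] for p in bumped])
```

Note that this approach changes the recorded time of the shifted points, so it suits data where a one-nanosecond offset is acceptable.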

The output of the example queries in this article has been modified to clearly show the different approaches and results for handling duplicate data.
