InfluxDB design insights and tradeoffs

This page documents an earlier version of InfluxDB. InfluxDB v2 is the latest stable version. See the equivalent InfluxDB v2 documentation: InfluxDB design principles.

InfluxDB is a time series database. Optimizing for this use case entails tradeoffs, primarily giving up functionality to gain performance. The design insights below led to those tradeoffs:

  1. For the time series use case, we assume that if the same data is sent multiple times, the client is resending the exact same data rather than sending new data.

    Pro: Simplified conflict resolution increases write performance.
    Con: Cannot store duplicate data; may overwrite data in rare circumstances.

  2. Deletes are a rare occurrence. When they do occur it is almost always against large ranges of old data that are cold for writes.

    Pro: Restricting access to deletes allows for increased query and write performance.
    Con: Delete functionality is significantly restricted.

  3. Updates to existing data are a rare occurrence and contentious updates never happen. Time series data is predominantly new data that is never updated.

    Pro: Restricting access to updates allows for increased query and write performance.
    Con: Update functionality is significantly restricted.

  4. The vast majority of writes are for data with very recent timestamps and the data is added in time ascending order.

    Pro: Adding data in time ascending order is significantly more performant.
    Con: Writing points with random times or with time not in ascending order is significantly less performant.

  5. Scale is critical. The database must be able to handle a high volume of reads and writes.

    Pro: The database can handle a high volume of reads and writes.
    Con: The InfluxDB development team was forced to make tradeoffs to increase performance.

  6. Being able to write and query the data is more important than having a strongly consistent view.

    Pro: Writing and querying the database can be done by multiple clients and at high loads.
    Con: Query results may not include the most recent points if the database is under heavy load.

  7. Many time series are ephemeral. There are often time series that appear only for a few hours and then go away, e.g. a new host that gets started and reports for a while and then gets shut down.

    Pro: InfluxDB is good at managing discontinuous data.
    Con: Schema-less design means that some database functions are not supported; for example, there are no cross-table joins.

  8. No one point is too important.

    Pro: InfluxDB has very powerful tools to deal with aggregate data and large data sets.
    Con: Points don’t have IDs in the traditional sense; they are differentiated by timestamp and series.
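
The identity rule behind insights 1 and 8 can be illustrated with a small Python sketch. It is a toy in-memory model, not InfluxDB's storage engine: `ToyStore` and its `write` method are hypothetical names, and the sketch assumes that a point is identified by its measurement, tag set, and timestamp, and that a later write to the same identity merges the field sets with the newer value winning.

```python
from typing import Dict, Tuple

# A point's identity in this toy model: (measurement, sorted tag pairs, timestamp).
# There is no separate point ID.
PointKey = Tuple[str, Tuple[Tuple[str, str], ...], int]


class ToyStore:
    """Hypothetical in-memory model of point identity; not InfluxDB code."""

    def __init__(self) -> None:
        self.points: Dict[PointKey, Dict[str, float]] = {}

    def write(self, measurement: str, tags: Dict[str, str],
              fields: Dict[str, float], timestamp: int) -> None:
        # Sort the tags so that tag order does not change the identity.
        key = (measurement, tuple(sorted(tags.items())), timestamp)
        # Same identity: merge field sets, newer values overwrite older ones.
        self.points.setdefault(key, {}).update(fields)


store = ToyStore()
store.write("cpu", {"host": "a"}, {"usage": 0.5}, 1000)
store.write("cpu", {"host": "a"}, {"usage": 0.5}, 1000)               # resend: no change
store.write("cpu", {"host": "a"}, {"usage": 0.9, "idle": 0.1}, 1000)  # overwrites usage
store.write("cpu", {"host": "b"}, {"usage": 0.2}, 1000)               # different series
```

After these writes the toy store holds two points, one per series. The resend had no effect, and the third write silently replaced the original `usage` value, which is the "may overwrite data" con from insight 1 in miniature.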
