Documentation

Join on time

Use join.time() to join two streams of data based on time values in the _time column. This type of join operation is common when joining two streams of time series data.

join.time() can use any of the available join methods. Which method you use depends on your desired behavior:

  • inner (Default): Drop any rows from both input streams that do not have a matching row in the other stream.

  • left: Output a row for each row in the left data stream with data matching from the right data stream. If there is no matching data in the right data stream, non-group-key columns with values from the right data stream are null.

  • right: Output a row for each row in the right data stream with data matching from the left data stream. If there is no matching data in the left data stream, non-group-key columns with values from the left data stream are null.

  • full: Output a row for all rows in both the left and right input streams and join rows that match based on their _time value.

Use join.time to join your data

  1. Import the join package.

  2. Define the left and right data streams to join:

    • Each stream must also have a _time column.
    • Each stream must have one or more columns with common values. Column labels do not need to match, but column values do.
    • Each stream should have identical group keys.

    For more information, see join data requirements.

  3. Use join.time() to join the two streams together based on time values. Provide the following parameters:

    • left: (Required) Stream of data representing the left side of the join.
    • right: (Required) Stream of data representing the right side of the join.
    • as: (Required) Join output function that returns a record with values from each input stream. For example: (l, r) => ({r with column1: l.column1, column2: l.column2}).
    • method: Join method to use. Default is inner.

The following example uses a filtered selections from the machineProduction sample data set as the left and right data streams.

Example data grouping

The example below regroups both the left and right streams to remove the _field column from the group key. Because join.time() only compares tables with matching group key instances, to join streams with different _field column values, _field cannot be part of the group key.

import "influxdata/influxdb/sample"
import "join"

left =
    sample.data(set: "machineProduction")
        |> filter(fn: (r) => r.stationID == "g1" or r.stationID == "g2" or r.stationID == "g3")
        |> filter(fn: (r) => r._field == "pressure")
        |> limit(n: 5)
        |> group(columns: ["_time", "_value", "_field"], mode: "except")

right =
    sample.data(set: "machineProduction")
        |> filter(fn: (r) => r.stationID == "g1" or r.stationID == "g2" or r.stationID == "g3")
        |> filter(fn: (r) => r._field == "pressure_target")
        |> limit(n: 5)
        |> group(columns: ["_time", "_value", "_field"], mode: "except")

join.time(method: "left", left: left, right: right, as: (l, r) => ({l with target: r._value}))

View example input and output data


Was this page helpful?

Thank you for your feedback!


The future of Flux

Flux is going into maintenance mode. You can continue using it as you currently are without any changes to your code.

Flux is going into maintenance mode and will not be supported in InfluxDB 3.0. This was a decision based on the broad demand for SQL and the continued growth and adoption of InfluxQL. We are continuing to support Flux for users in 1.x and 2.x so you can continue using it with no changes to your code. If you are interested in transitioning to InfluxDB 3.0 and want to future-proof your code, we suggest using InfluxQL.

For information about the future of Flux, see the following: