Documentation

Join on time

Use join.time() to join two streams of data based on time values in the _time column. This type of join operation is common when joining two streams of time series data.

join.time() can use any of the available join methods. Which method you use depends on your desired behavior:

  • inner (Default): Drop any rows from both input streams that do not have a matching row in the other stream.

  • left: Output a row for each row in the left data stream with data matching from the right data stream. If there is no matching data in the right data stream, non-group-key columns with values from the right data stream are null.

  • right: Output a row for each row in the right data stream with data matching from the left data stream. If there is no matching data in the left data stream, non-group-key columns with values from the left data stream are null.

  • full: Output a row for all rows in both the left and right input streams and join rows that match based on their _time value.

Use join.time to join your data

  1. Import the join package.

  2. Define the left and right data streams to join:

    • Each stream must also have a _time column.
    • Each stream must have one or more columns with common values. Column labels do not need to match, but column values do.
    • Each stream should have identical group keys.

    For more information, see join data requirements.

  3. Use join.time() to join the two streams together based on time values. Provide the following parameters:

    • left: (Required) Stream of data representing the left side of the join.
    • right: (Required) Stream of data representing the right side of the join.
    • as: (Required) Join output function that returns a record with values from each input stream. For example: (l, r) => ({r with column1: l.column1, column2: l.column2}).
    • method: Join method to use. Default is inner.

The following example uses a filtered selections from the machineProduction sample data set as the left and right data streams.

Example data grouping

The example below regroups both the left and right streams to remove the _field column from the group key. Because join.time() only compares tables with matching group key instances, to join streams with different _field column values, _field cannot be part of the group key.

import "influxdata/influxdb/sample"
import "join"

left =
    sample.data(set: "machineProduction")
        |> filter(fn: (r) => r.stationID == "g1" or r.stationID == "g2" or r.stationID == "g3")
        |> filter(fn: (r) => r._field == "pressure")
        |> limit(n: 5)
        |> group(columns: ["_time", "_value", "_field"], mode: "except")

right =
    sample.data(set: "machineProduction")
        |> filter(fn: (r) => r.stationID == "g1" or r.stationID == "g2" or r.stationID == "g3")
        |> filter(fn: (r) => r._field == "pressure_target")
        |> limit(n: 5)
        |> group(columns: ["_time", "_value", "_field"], mode: "except")

join.time(method: "left", left: left, right: right, as: (l, r) => ({l with target: r._value}))

View example input and output


Was this page helpful?

Thank you for your feedback!


Upgrade to InfluxDB Cloud or InfluxDB 2.0!

InfluxDB Cloud and InfluxDB OSS 2.0 ready for production.