join.tables() function

Flux 0.172.0+
View InfluxDB support

join.tables() joins two input streams together using a specified method, predicate, and a function to join two corresponding records, one from each input stream.

join.tables() only compares records with the same group key. Output tables have the same grouping as the input tables.

Function type signature

(
    <-left: stream[A],
    as: (l: A, r: B) => C,
    method: string,
    on: (l: A, r: B) => bool,
    right: stream[B],
) => stream[C] where A: Record, B: Record, C: Record

For more information, see Function type signatures.

Parameters

left

Left input stream. Default is piped-forward data (<-).

right

(Required) Right input stream.

on

(Required) Function that takes a left and right record (l, and r respectively), and returns a boolean.

The body of the function must be a single boolean expression, consisting of one or more equality comparisons between a property of l and a property of r, each chained together by the and operator.

as

(Required) Function that takes a left and a right record (l and r respectively), and returns a record. The returned record is included in the final output.

method

(Required) String that specifies the join method.

Supported methods:

inner
left
right
full

Perform an inner join

import "sampledata"
import "join"

ints = sampledata.int()
strings = sampledata.string()

join.tables(
    method: "inner",
    left: ints,
    right: strings,
    on: (l, r) => l._time == r._time,
    as: (l, r) => ({l with label: r._value}),
)

View example output

_time	_value	label	*tag
2021-01-01T00:00:00Z	-2	smpl_g9qczs	t1
2021-01-01T00:00:10Z	10	smpl_0mgv9n	t1
2021-01-01T00:00:20Z	7	smpl_phw664	t1
2021-01-01T00:00:30Z	17	smpl_guvzy4	t1
2021-01-01T00:00:40Z	15	smpl_5v3cce	t1
2021-01-01T00:00:50Z	4	smpl_s9fmgy	t1

_time	_value	label	*tag
2021-01-01T00:00:00Z	19	smpl_b5eida	t2
2021-01-01T00:00:10Z	4	smpl_eu4oxp	t2
2021-01-01T00:00:20Z	-3	smpl_5g7tz4	t2
2021-01-01T00:00:30Z	19	smpl_sox1ut	t2
2021-01-01T00:00:40Z	13	smpl_wfm757	t2
2021-01-01T00:00:50Z	1	smpl_dtn2bv	t2

Perform a left outer join

If the join method is anything other than inner, pay special attention to how the output record is constructed in the as function.

Because of how flux handles outer joins, it’s possible for either l or r to be a default record. This means any value in a non-group-key column could be null.

For more information about the behavior of outer joins, see the Outer joins section in the join package documentation.

In the case of a left outer join, l is guaranteed to not be a default record. To ensure that the output record has non-null values for any columns that aren’t part of the group key, use values from l. Using a non-group-key value from r risks that value being null.

The example below constructs the output record almost entirely from properties of l. The only exception is the v_right column which gets its value from r._value. In this case, understand and expect that v_right will sometimes be null.

import "array"
import "join"

left =
    array.from(
        rows: [
            {_time: 2022-01-01T00:00:00Z, _value: 1, label: "a"},
            {_time: 2022-01-01T00:00:00Z, _value: 2, label: "b"},
            {_time: 2022-01-01T00:00:00Z, _value: 3, label: "d"},
        ],
    )
right =
    array.from(
        rows: [
            {_time: 2022-01-01T00:00:00Z, _value: 0.4, id: "a"},
            {_time: 2022-01-01T00:00:00Z, _value: 0.5, id: "c"},
            {_time: 2022-01-01T00:00:00Z, _value: 0.6, id: "d"},
        ],
    )

join.tables(
    method: "left",
    left: left,
    right: right,
    on: (l, r) => l.label == r.id and l._time == r._time,
    as: (l, r) => ({_time: l._time, label: l.label, v_left: l._value, v_right: r._value}),
)

View example output

_time	label	v_left	v_right
2022-01-01T00:00:00Z	a	1	0.4
2022-01-01T00:00:00Z	b	2
2022-01-01T00:00:00Z	d	3	0.6

Perform a right outer join

The next example is nearly identical to the previous example, but uses the right join method. With this method, r is guaranteed to not be a default record, but l may be a default record. Because l is more likely to contain null values, the output record is built almost entirely from properties of r, with the exception of v_left, which we expect to sometimes be null.

import "array"
import "join"

left =
    array.from(
        rows: [
            {_time: 2022-01-01T00:00:00Z, _value: 1, label: "a"},
            {_time: 2022-01-01T00:00:00Z, _value: 2, label: "b"},
            {_time: 2022-01-01T00:00:00Z, _value: 3, label: "d"},
        ],
    )
right =
    array.from(
        rows: [
            {_time: 2022-01-01T00:00:00Z, _value: 0.4, id: "a"},
            {_time: 2022-01-01T00:00:00Z, _value: 0.5, id: "c"},
            {_time: 2022-01-01T00:00:00Z, _value: 0.6, id: "d"},
        ],
    )

join.tables(
    method: "right",
    left: left,
    right: right,
    on: (l, r) => l.label == r.id and l._time == r._time,
    as: (l, r) => ({_time: r._time, label: r.id, v_left: l._value, v_right: r._value}),
)

View example output

_time	label	v_left	v_right
2022-01-01T00:00:00Z	a	1	0.4
2022-01-01T00:00:00Z	c		0.5
2022-01-01T00:00:00Z	d	3	0.6

Perform a full outer join

In a full outer join, there are no guarantees about l or r. Either one of them could be a default record, but they will never both be a default record at the same time.

To get non-null values for the output record, check both l and r to see which contains the desired values.

The example below defines a function for the as parameter that appropriately handles the uncertainty of a full outer join.

v_left and v_right still use values from l and r directly, because we expect them to sometimes be null in the output table.

import "array"
import "join"

left =
    array.from(
        rows: [
            {_time: 2022-01-01T00:00:00Z, _value: 1, label: "a"},
            {_time: 2022-01-01T00:00:00Z, _value: 2, label: "b"},
            {_time: 2022-01-01T00:00:00Z, _value: 3, label: "d"},
        ],
    )
right =
    array.from(
        rows: [
            {_time: 2022-01-01T00:00:00Z, _value: 0.4, id: "a"},
            {_time: 2022-01-01T00:00:00Z, _value: 0.5, id: "c"},
            {_time: 2022-01-01T00:00:00Z, _value: 0.6, id: "d"},
        ],
    )

join.tables(
    method: "full",
    left: left,
    right: right,
    on: (l, r) => l.label == r.id and l._time == r._time,
    as: (l, r) => {
        time = if exists l._time then l._time else r._time
        label = if exists l.label then l.label else r.id

        return {_time: time, label: label, v_left: l._value, v_right: r._value}
    },
)

View example output

_time	label	v_left	v_right
2022-01-01T00:00:00Z	a	1	0.4
2022-01-01T00:00:00Z	b	2
2022-01-01T00:00:00Z	c		0.5
2022-01-01T00:00:00Z	d	3	0.6

transformations

Was this page helpful?

Thank you for your feedback!

Support and feedback

Thank you for being part of our community! We welcome and encourage your feedback and bug reports for Flux and this documentation. To find support, use the following resources:

Customers with an annual or support contract can contact InfluxData Support.

Edit this page Submit docs issue Submit Flux issue

join.tables() function

Function type signature

Parameters

left

right

on

as

method

Examples

Perform an inner join

Output data

Perform a left outer join

Output data

Perform a right outer join

Output data

Perform a full outer join

Output data

Support and feedback

New in InfluxDB 3.8

InfluxDB Docker latest tag changing to InfluxDB 3 Core

join.tables() function

Function type signature

Parameters

left

right

on

as

method

Examples

Perform an inner join

Output data

Perform a left outer join

Output data

Perform a right outer join

Output data

Perform a full outer join

Output data

Related

Support and feedback

Where are you running InfluxDB?

AWS

GCP

Azure

Default

Custom

Thank you for your feedback!

InfluxDB support for join.tables()

InfluxDB Open Source (OSS)

InfluxDB Enterprise

New in InfluxDB 3.8

InfluxDB Docker latest tag changing to InfluxDB 3 Core

InfluxDB support for `join.tables()`