Documentation

GROUP BY clause

InfluxDB 3 Enterprise is in Public Alpha

InfluxDB 3 Enterprise is in public alpha and available for testing and feedback, but is not meant for production use. Both the product and this documentation are works in progress. We welcome and encourage your input about your experience with the alpha and invite you to join our public channels for updates and to share feedback.

Alpha expectations and recommendations

Use the GROUP BY clause to group data by one or more specified tags or into specified time intervals. GROUP BY requires an aggregate or selector function in the SELECT statement.

Syntax

SELECT_clause FROM_clause [WHERE_clause] GROUP BY group_expression[, ..., group_expression_n]

GROUP BY clause behaviors

  • GROUP BY tag_key - Groups data by a specific tag
  • GROUP BY tag_key1, tag_key2 - Groups data by more than one tag
  • GROUP BY * - Groups data by all tags
  • GROUP BY /regex/ - Groups data by tag keys that match the regular expression
  • GROUP BY time() - Groups data into time intervals (windows)

If a query includes WHERE and GROUP BY, the GROUP BY clause must appear after the WHERE clause.

GROUP BY tags

Groups data by one or more tag columns.

GROUP BY tags examples

The following examples use the Bitcoin price sample data.

Group data by a single tag

Group data by more than one tag

Group data by all tags

Group data by tag keys that match a regular expression

GROUP BY time

GROUP BY time() groups data by into specified time intervals, also known as “windows”, and applies the aggregate and selector functions in the SELECT clause to each interval. Use the time() function to specify the time interval to group by.

SELECT_clause FROM_clause WHERE <time_range> GROUP BY time(time_interval[, offset])[, group_expression (...)] [fill(behavior)]

GROUP BY time() intervals use preset round-number time boundaries that are independent of time conditions in the WHERE clause. Output data uses window start boundaries as the aggregate timestamps. Use the offset argument of the time() function to shift time boundaries forward or backward in time.

GROUP by time and fill gaps

When grouping by time, if a window in the queried time range does not contain data, results return a row for the empty window containing the timestamp of the empty window and null values for each queried field. Use the fill() function at the end of the GROUP BY clause to replace null field values. If no FILL clause is included, the default behavior is fill(null).

fill() provides the following behaviors for filling values:

  • numeric literal: Replaces null values with the specified numeric literal.
  • linear: Uses linear interpolation between existing values to replace null values.
  • none: Removes rows with null field values.
  • null: Keeps null values and associated timestamps.
  • previous: Replaces null values with the most recent non-null value.

See the fill() documentation for detailed examples.

GROUP BY time examples

The following examples use the Bitcoin price sample data.

Group and aggregate query results into 1 hour windows

Group and aggregate query results into 1 week intervals by tag

GROUP BY time with offset

Group and aggregate query results into 1 hour intervals and offset time boundaries by +15 minutes

Group and aggregate query results into 1 hour intervals and offset time boundaries by -15 minutes

GROUP BY time and fill gaps

Group and aggregate query results into 30 minute intervals and fill gaps with 0

Group and aggregate query results into 30 minute intervals and fill gaps using linear interpolation

Group and aggregate query results into 30 minute intervals and fill gaps with previous values

Result set

If at least one row satisfies the query, InfluxDB 3 Enterprise returns row data in the query result set. If a query uses a GROUP BY clause, the result set includes the following:

  • Columns listed in the query’s SELECT clause
  • A time column that contains the timestamp for the record or the group
  • An iox::measurement column that contains the record’s measurement (table) name
  • Columns listed in the query’s GROUP BY clause; each row in the result set contains the values used for grouping

Default time range

If a query doesn’t specify a time range in the WHERE clause, InfluxDB uses the default time range for filtering and grouping by time. If a query includes the GROUP BY clause and doesn’t specify a time range in the WHERE clause, the default time group is the default time range, and the time column in the result set contains the start of the range–for example:

SELECT mean(temp) FROM home GROUP BY room

name: home
tags: room=Kitchen

timemean
1970-01-01T00:00:00Z22.623076923076926

name: home
tags: room=Living Room

timemean
1970-01-01T00:00:00Z22.16923076923077

Notable behaviors of the GROUP BY clause

Cannot group by fields

InfluxQL does not support grouping data by fields.

Tag order does not matter

The order that tags are listed in the GROUP BY clause does not affect how data is grouped.

Grouping by tag and no time range returns unexpected timestamps

The time column contains the start of the default time range.

Data grouped by time may return unexpected timestamps

Because GROUP BY time() intervals use preset round-number time boundaries that are independent of time conditions in the WHERE clause, results may include timestamps outside of the queried time range. Results represent only data with timestamps in the specified time range, but output timestamps are determined by by the preset time boundaries.

The following example groups data by 1-hour intervals, but the time range defined in the WHERE clause covers only part of a window:

SELECT MEAN(field)
FROM example 
WHERE
  time >= '2022-01-01T00:30:00Z'
  AND time <= '2022-01-01T01:30:00Z'
GROUP BY time(1h)

Note: The timestamp in the first row of query results data occurs before the start of the queried time range. See why.

Example data

timefield
2022-01-01T00:00:00Z8
2022-01-01T00:15:00Z4
2022-01-01T00:30:00Z0
2022-01-01T00:45:00Z8
2022-01-01T01:00:00Z5
2022-01-01T01:15:00Z0
2022-01-01T01:30:00Z8
2022-01-01T01:45:00Z8
2022-01-01T02:00:00Z9
2022-01-01T02:15:00Z6
2022-01-01T02:30:00Z3
2022-01-01T02:45:00Z0

Query results

timefield
2022-01-01T00:00:00Z4
2022-01-01T01:00:00Z5.25
2022-01-01T02:00:00Z6

Why do these results include timestamps outside of the queried time range?

Fill with no data in the queried time range

Queries ignore fill() if no data exists in the queried time range. This is the expected behavior.

Fill with previous if no previous value exists

fill(previous) doesn’t fill null values if there is no previous value in the queried time range.

Fill with linear interpolation if there are not two values to interpolate between

fill(linear) doesn’t fill null values if there are no values before or after the null value in the queried time range.


Was this page helpful?

Thank you for your feedback!


The future of Flux

Flux is going into maintenance mode. You can continue using it as you currently are without any changes to your code.

Read more

InfluxDB 3 Open Source Now in Public Alpha

InfluxDB 3 Open Source is now available for alpha testing, licensed under MIT or Apache 2 licensing.

We are releasing two products as part of the alpha.

InfluxDB 3 Core, is our new open source product. It is a recent-data engine for time series and event data. InfluxDB 3 Enterprise is a commercial version that builds on Core’s foundation, adding historical query capability, read replicas, high availability, scalability, and fine-grained security.

For more information on how to get started, check out: