Aggregate or apply selector functions to data

A SQL query that aggregates data includes the following clauses:

* Required
  • * SELECT: Specify fields, tags, and calculations to output from a measurement or use the wild card alias (*) to select all fields and tags from a measurement.
  • * FROM: Specify the measurement to query data from.
  • WHERE: Only return data that meets the specified conditions–for example, falls within a time range, contains specific tag values, or contains a field value outside a specified range.
  • GROUP BY: Group data that have the same values for specified columns and expressions (for example, an aggregate function result).

For simplicity, the term “aggregate” in this guide refers to applying both aggregate and selector functions to a dataset.

Learn how to apply aggregate operations to your queried data:

Aggregate and selector functions

Both aggregate and selector functions return a single row from each SQL partition or group. For example, if you GROUP BY room and perform an aggregate operation in your SELECT clause, results include an aggregate value for each unique value of room.

Aggregate functions

Use aggregate functions to aggregate values in a specified column for each group and return a single row per group containing the aggregate value.

View aggregate functions

Basic aggregate query
SELECT AVG(co) from home

Selector functions

Use selector functions to “select” a value from a specified column. The available selector functions are designed to work with time series data.

View selector functions

Each selector function returns a Rust struct (similar to a JSON object) representing a single time and value from the specified column in the each group. What time and value get returned depend on the logic in the selector function. For example, selector_first returns the value of specified column in the first row of the group. selector_max returns the maximum value of the specified column in the group.

Selector struct schema

The struct returned from a selector function has two properties:

  • time: time value in the selected row
  • value: value of the specified column in the selected row
{time: 2023-01-01T00:00:00Z, value: 72.1}

Use selector functions

Each selector function has two arguments:

  • The first is the column to operate on.
  • The second is the time column to use in the selection logic.

In your SELECT statement, execute a selector function and use bracket notation to reference properties of the returned struct to populate the column value:

  selector_first(temp, time)['time'] AS time,
  selector_first(temp, time)['value'] AS temp,
FROM home

Example aggregate queries

Sample data

The following examples use the sample data written in the Get started writing data guide. To run the example queries and return results, write the sample data to your InfluxDB Cloud Serverless bucket before running the example queries.

Perform an ungrouped aggregation

To aggregate all queried values in a specified column:

  • Use aggregate or selector functions in your SELECT statement.
  • Do not include a GROUP BY clause to leave your data ungrouped.
SELECT avg(co) AS 'average co' from home

View example results

Group and aggregate data

To apply aggregate or selector functions to grouped data:

  • Use aggregate or selector functions in your SELECT statement.
  • Include columns to group by in your SELECT statement.
  • Include a GROUP BY clause with a comma-delimited list of columns and expressions to group by.

Keep the following in mind when using GROUP BY:

  • GROUP BY can use column aliases that are defined in the SELECT clause.
  • GROUP BY can’t use an alias named time. If you include time in GROUP BY, it always uses the measurement time column.
  avg(temp) AS 'average temp'
FROM home

View example results

Downsample data by applying interval-based aggregates

A common use case when querying time series is downsampling data by applying aggregates to time-based groups. To group and aggregate data into time-based groups:

  • In your SELECT clause:

    • Use the DATE_BIN function to calculate time intervals and output a column that contains the start of the interval nearest to the time timestamp in each row–for example, the following clause calculates two-hour intervals starting at 1970-01-01T00:00:00Z and returns a new time column that contains the start of the interval nearest to home.time:

      SELECT DATE_BIN(INTERVAL '2 hours', time, '1970-01-01T00:00:00Z'::TIMESTAMP) AS time
      from home

      Given a time value 2023-03-09T13:00:50.000Z, the output time column contains 2023-03-09T12:00:00.000Z.

    • Use aggregate or selector functions on specified columns.

  • In your GROUP BY clause:

    • Use the DATE_BIN function with the same parameters used in the SELECT clause.
    • Specify other columns (for example, room) that are specified in the SELECT clause and aren’t used in a selector function.
    SELECT DATE_BIN(INTERVAL '2 hours', time, '1970-01-01T00:00:00Z'::TIMESTAMP) AS time
    GROUP BY DATE_BIN(INTERVAL '2 hours', time, '1970-01-01T00:00:00Z'::TIMESTAMP), room
  • Include an ORDER BY clause with columns to sort by.

The following example retrieves unique combinations of time intervals and rooms with their minimum, maximum, and average temperatures.

  DATE_BIN(INTERVAL '2 hours', time, '1970-01-01T00:00:00Z'::TIMESTAMP) AS time,
  selector_max(temp, time)['value'] AS 'max temp',
  selector_min(temp, time)['value'] AS 'min temp',
  avg(temp) AS 'average temp'
FROM home
GROUP BY DATE_BIN(INTERVAL '2 hours', time, '1970-01-01T00:00:00Z'::TIMESTAMP), room
ORDER BY room, time

View example results

Query rows based on aggregate values

To query data based on values after an aggregate operation, include a HAVING clause with defined predicate conditions such as a value threshold. Predicates in the WHERE clause are applied before data is aggregated. Predicates in the HAVING clause are applied after data is aggregated.

  avg(co) AS 'average co'
FROM home
HAVING "average co" > 5

View example results

Was this page helpful?

Thank you for your feedback!

Introducing InfluxDB 3.0

The new core of InfluxDB built with Rust and Apache Arrow. Available today in InfluxDB Cloud Dedicated.

Learn more

State of the InfluxDB Cloud Serverless documentation

The new documentation for InfluxDB Cloud Serverless is a work in progress. We are adding new information and content almost daily. Thank you for your patience!

If there is specific information you’re looking for, please submit a documentation issue.

InfluxDB Cloud Serverless powered by IOx