---
title: Aggregate data with SQL
description: Use aggregate and selector functions to perform aggregate operations on your time series data.
url: https://docs.influxdata.com/influxdb3/clustered/query-data/sql/aggregate-select/
estimated_tokens: 7350
product: InfluxDB Clustered
version: clustered
---

# Aggregate data with SQL

An SQL query that aggregates data includes the following clauses:

\* Required

-   \* `SELECT`: Specify fields, tags, and calculations to output from a measurement or use the wildcard alias (`*`) to select all fields and tags from a measurement.
-   \* `FROM`: Specify the measurement to query data from.
-   `WHERE`: Only return data that meets the specified conditions–for example, falls within a time range, contains specific tag values, or contains a field value outside a specified range.
-   `GROUP BY`: Group data that have the same values for specified columns and expressions (for example, an aggregate function result).

For simplicity, the term **“aggregate”** in this guide refers to applying both aggregate and selector functions to a dataset.

Learn how to apply aggregate operations to your queried data:

-   [Aggregate and selector functions](#aggregate-and-selector-functions)
    -   [Aggregate functions](#aggregate-functions)
    -   [Selector functions](#selector-functions)
-   [Example aggregate queries](#example-aggregate-queries)

## Aggregate and selector functions

Both aggregate and selector functions return a single row from each SQL partition or group. For example, if you `GROUP BY room` and perform an aggregate operation in your `SELECT` clause, results include an aggregate value for each unique value of `room`.

### Aggregate functions

Use **aggregate functions** to aggregate values in a specified column for each group and return a single row per group containing the aggregate value.

[View aggregate functions](/influxdb3/clustered/reference/sql/functions/aggregate/)

##### Basic aggregate query

```sql
SELECT AVG(co) from home
```

### Selector functions

Use **selector functions** to “select” a value from a specified column. The available selector functions are designed to work with time series data.

[View selector functions](/influxdb3/clustered/reference/sql/functions/selector/)

Each selector function returns a Rust *struct* (similar to a JSON object) representing a single time and value from the specified column in the each group. What time and value get returned depend on the logic in the selector function. For example, `selector_first` returns the value of specified column in the first row of the group. `selector_max` returns the maximum value of the specified column in the group.

#### Selector struct schema

The struct returned from a selector function has two properties:

-   **time**: `time` value in the selected row
-   **value**: value of the specified column in the selected row

```js
{time: 2023-01-01T00:00:00Z, value: 72.1}
```

#### Use selector functions

Each selector function has two arguments:

-   The first is the column to operate on.
-   The second is the time column to use in the selection logic.

In your `SELECT` statement, execute a selector function and use bracket notation to reference properties of the [returned struct](#selector-struct-schema) to populate the column value:

```sql
SELECT
  selector_first(temp, time)['time'] AS time,
  selector_first(temp, time)['value'] AS temp,
  room
FROM home
GROUP BY room
```

## Example aggregate queries

-   [Perform an ungrouped aggregation](#perform-an-ungrouped-aggregation)
-   [Group and aggregate data](#group-and-aggregate-data)
    -   [Downsample data by applying interval-based aggregates](#downsample-data-by-applying-interval-based-aggregates)
-   [Query rows based on aggregate values](#query-rows-based-on-aggregate-values)

#### Sample data

The following examples use the sample data written in the [Get started writing data guide](/influxdb3/clustered/get-started/write/). To run the example queries and return results, [write the sample data](/influxdb3/clustered/get-started/write/#write-line-protocol-to-influxdb) to your InfluxDB Clustered database before running the example queries.

### Perform an ungrouped aggregation

To aggregate *all* queried values in a specified column:

-   Use aggregate or selector functions in your `SELECT` statement.
-   Do not include a `GROUP BY` clause to leave your data ungrouped.

```sql
SELECT avg(co) AS 'average co' from home
```

[](#view-example-results)

View example results

| average co |
| --- |
| 5.269230769230769 |

### Group and aggregate data

To apply aggregate or selector functions to grouped data:

-   Use aggregate or selector functions in your `SELECT` statement.
-   Include columns to group by in your `SELECT` statement.
-   Include a `GROUP BY` clause with a comma-delimited list of columns and expressions to group by.

Keep the following in mind when using `GROUP BY`:

-   `GROUP BY` can use column aliases that are defined in the `SELECT` clause.
-   `GROUP BY` can’t use an alias named `time`. If you include `time` in `GROUP BY`, it always uses the measurement `time` column.

```sql
SELECT
  room,
  avg(temp) AS 'average temp'
FROM home
GROUP BY room
```

[](#view-example-results)

View example results

| room | average temp |
| --- | --- |
| Living Room | 22.16923076923077 |
| Kitchen | 22.623076923076926 |

#### Downsample data by applying interval-based aggregates

A common use case when querying time series is downsampling data by applying aggregates to time-based groups. To group and aggregate data into time-based groups:

-   In your `SELECT` clause:
    
    -   Use the [`DATE_BIN` function](/influxdb3/clustered/reference/sql/functions/time-and-date/#date_bin) to calculate time intervals and output a column that contains the start of the interval nearest to the `time` timestamp in each row–for example, the following clause calculates two-hour intervals starting at `1970-01-01T00:00:00Z` and returns a new `time` column that contains the start of the interval nearest to `home.time`:
        
        ```sql
        SELECT
          DATE_BIN(INTERVAL '2 hours', time, '1970-01-01T00:00:00Z'::TIMESTAMP) AS time
        FROM home
        ...
        ```
        
        Given a `time` value `2023-03-09T13:00:50.000Z`, the output `time` column contains `2023-03-09T12:00:00.000Z`.
        
    -   Use [aggregate](/influxdb3/clustered/reference/sql/functions/aggregate/) or [selector](/influxdb3/clustered/reference/sql/functions/selector/) functions on specified columns.
        
-   In your `GROUP BY` clause:
    
    -   Specify the `DATE_BIN(...)` column ordinal reference (`1`).
    -   Specify other columns (for example, `room`) that are specified in the `SELECT` clause and aren’t used in a selector function.
    
    ```sql
    SELECT
      DATE_BIN(INTERVAL '2 hours', time, '1970-01-01T00:00:00Z'::TIMESTAMP) AS time
    ...
    GROUP BY 1, room
    ...
    ```
    
    To reference the `DATE_BIN(...)` result column by *name* in the `GROUP BY` clause, assign an alias other than “time” in the `SELECT` clause–for example:
    
    ```sql
    SELECT
      DATE_BIN(INTERVAL '2 hours', time, '1970-01-01T00:00:00Z'::TIMESTAMP) AS _time
    FROM home
    ...
    GROUP BY _time, room
    ```
    
-   Include an `ORDER BY` clause with columns to sort by.
    

The following example retrieves unique combinations of time intervals and rooms with their minimum, maximum, and average temperatures.

```sql
SELECT
  DATE_BIN(INTERVAL '2 hours', time, '1970-01-01T00:00:00Z'::TIMESTAMP) AS time,
  room,
  selector_max(temp, time)['value'] AS 'max temp',
  selector_min(temp, time)['value'] AS 'min temp',
  avg(temp) AS 'average temp'
FROM home
GROUP BY 1, room
ORDER BY room, 1
```

[](#view-example-results)

View example results

| time | room | max temp | min temp | average temp |
| --- | --- | --- | --- | --- |
| 2022-01-01T08:00:00Z | Kitchen | 23 | 21 | 22 |
| 2022-01-01T10:00:00Z | Kitchen | 22.7 | 22.4 | 22.549999999999997 |
| 2022-01-01T12:00:00Z | Kitchen | 22.8 | 22.5 | 22.65 |
| 2022-01-01T14:00:00Z | Kitchen | 22.8 | 22.7 | 22.75 |
| 2022-01-01T16:00:00Z | Kitchen | 22.7 | 22.4 | 22.549999999999997 |
| 2022-01-01T18:00:00Z | Kitchen | 23.3 | 23.1 | 23.200000000000003 |
| 2022-01-01T20:00:00Z | Kitchen | 22.7 | 22.7 | 22.7 |
| 2022-01-01T08:00:00Z | Living Room | 21.4 | 21.1 | 21.25 |
| 2022-01-01T10:00:00Z | Living Room | 22.2 | 21.8 | 22 |
| 2022-01-01T12:00:00Z | Living Room | 22.4 | 22.2 | 22.299999999999997 |
| 2022-01-01T14:00:00Z | Living Room | 22.3 | 22.3 | 22.3 |
| 2022-01-01T16:00:00Z | Living Room | 22.6 | 22.4 | 22.5 |
| 2022-01-01T18:00:00Z | Living Room | 22.8 | 22.5 | 22.65 |
| 2022-01-01T20:00:00Z | Living Room | 22.2 | 22.2 | 22.2 |

#### GROUP BY time

In the `GROUP BY` clause, the name “time” always refers to the `time` column in the source table. If you want to reference a calculated time column by name, use an alias different from “time”–for example:

```sql
SELECT
  DATE_BIN(INTERVAL '2 hours', time, '1970-01-01T00:00:00Z'::TIMESTAMP)
  AS _time,
  room,
  selector_max(temp, time)['value'] AS 'max temp',
  selector_min(temp, time)['value'] AS 'min temp',
  avg(temp) AS 'average temp'
FROM home
GROUP BY _time, room
ORDER BY room, _time
```

### Query rows based on aggregate values

To query data based on values after an aggregate operation, include a `HAVING` clause with defined predicate conditions such as a value threshold. Predicates in the `WHERE` clause are applied *before* data is aggregated. Predicates in the `HAVING` clause are applied *after* data is aggregated.

```sql
SELECT
  room,
  avg(co) AS 'average co'
FROM home
GROUP BY room
HAVING "average co" > 5
```

[](#view-example-results)

View example results

| room | average co |
| --- | --- |
| Kitchen | 6.6923076923076925 |

#### Related

-   [SQL aggregate functions](/influxdb3/clustered/reference/sql/functions/aggregate/)
-   [SQL selector functions](/influxdb3/clustered/reference/sql/functions/selector/)

[query](/influxdb3/clustered/tags/query/) [sql](/influxdb3/clustered/tags/sql/)
