---
title: Use pandas to analyze data
description: Use the pandas Python data analysis library to analyze and visualize time series data stored in InfluxDB Clustered.
url: https://docs.influxdata.com/influxdb3/clustered/process-data/tools/pandas/
estimated_tokens: 2034
publisher: InfluxData
canonical: https://docs.influxdata.com/influxdb3/clustered/process-data/tools/pandas/
date: '2026-05-15T15:46:14-06:00'
lastmod: '2026-05-15T15:46:14-06:00'
---

Use [pandas](https://pandas.pydata.org/), the Python data analysis library, to process, analyze, and visualize data
stored in an InfluxDB Clustered database.

>
>
> **pandas** is an open source, BSD-licensed library providing high-performance,
> easy-to-use data structures and data analysis tools for the Python programming language.
>
>
>
> [pandas documentation](https://pandas.pydata.org/docs/)
>
>

* [Install prerequisites](#install-prerequisites)
* [Install pandas](#install-pandas)
* [Use PyArrow to convert query results to pandas](#use-pyarrow-to-convert-query-results-to-pandas)
* [Use pandas to analyze data](#use-pandas-to-analyze-data)
  * [View data information and statistics](#view-data-information-and-statistics)
  * [Downsample time series](#downsample-time-series)

## Install prerequisites

The examples in this guide assume using a Python virtual environment and the InfluxDB 3 [`influxdb3-python` Python client library](/influxdb3/clustered/reference/client-libraries/v3/python/).
For more information, see how to [get started using Python to query InfluxDB](/influxdb3/clustered/query-data/execute-queries/client-libraries/python/).

Installing `influxdb3-python` also installs the [`pyarrow`](https://arrow.apache.org/docs/python/index.html) library that provides Python bindings for Apache Arrow.

## Install pandas

To use pandas, you need to install and import the `pandas` library.

In your terminal, use `pip` to install `pandas` in your active [Python virtual environment](/influxdb3/clustered/query-data/execute-queries/client-libraries/python/#create-a-project-virtual-environment):

```sh
pip install pandas
```

## Use PyArrow to convert query results to pandas

The following steps use Python, `influxdb3-python`, and `pyarrow` to query InfluxDB and stream Arrow data to a pandas `DataFrame`.

1. In your editor, copy and paste the following code to a new file–for example, `pandas-example.py`:

   ```
   # pandas-example.py

   from influxdb_client_3 import InfluxDBClient3
   import pandas

   # Instantiate an InfluxDB client configured for a database
   client = InfluxDBClient3(
     "https://cluster-host.com",
     database="DATABASE_NAME",
     token="DATABASE_TOKEN")

   # Execute the query to retrieve all record batches in the stream
   # formatted as a PyArrow Table.
   table = client.query(
     '''SELECT *
       FROM home
       WHERE time >= now() - INTERVAL '90 days'
       ORDER BY time'''
   )

   client.close()

   # Convert the PyArrow Table to a pandas DataFrame.
   dataframe = table.to_pandas()

   print(dataframe)
   ```

2. Replace the following configuration values:

   * `DATABASE_NAME`: the name of the [database](/influxdb3/clustered/admin/databases/) to query
   * `DATABASE_TOKEN`:
     a [database token](/influxdb3/clustered/admin/tokens/#database-tokens)with *read* permission on the specified database

3. In your terminal, use the Python interpreter to run the file:

   ```
   python pandas-example.py
   ```

The example calls the following methods:

* [`InfluxDBClient3.query()`](/influxdb3/clustered/reference/client-libraries/v3/python/#influxdbclient3query): sends the query request and returns a [`pyarrow.Table`](https://arrow.apache.org/docs/python/generated/pyarrow.Table.html) that contains all the Arrow record batches from the response stream.

* [`pyarrow.Table.to_pandas()`](https://arrow.apache.org/docs/python/generated/pyarrow.Table.html#pyarrow.Table.to_pandas): Creates a [`pandas.DataFrame`](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html#pandas.DataFrame) from the data in the PyArrow `Table`.

[](#view-example-results)

View example results

```
    co   hum         room  temp                time
0    0  35.9  Living Room  21.1 2022-01-02 11:46:40
1    0  35.9      Kitchen  21.0 2022-01-02 11:46:40
2    0  36.2      Kitchen  23.0 2022-01-02 12:46:40
3    0  35.9  Living Room  21.4 2022-01-02 12:46:40
4    0  36.1      Kitchen  22.7 2022-01-02 13:46:40
5    0  36.0  Living Room  21.8 2022-01-02 13:46:40
6    0  36.0      Kitchen  22.4 2022-01-02 14:46:40
7    0  36.0  Living Room  22.2 2022-01-02 14:46:40
8    0  36.0      Kitchen  22.5 2022-01-02 15:46:40
9    0  35.9  Living Room  22.2 2022-01-02 15:46:40
10   1  36.5      Kitchen  22.8 2022-01-02 16:46:40
11   0  36.0  Living Room  22.4 2022-01-02 16:46:40
12   1  36.3      Kitchen  22.8 2022-01-02 17:46:40
13   0  36.1  Living Room  22.3 2022-01-02 17:46:40
14   3  36.2      Kitchen  22.7 2022-01-02 18:46:40
15   1  36.1  Living Room  22.3 2022-01-02 18:46:40
16   7  36.0      Kitchen  22.4 2022-01-02 19:46:40
17   4  36.0  Living Room  22.4 2022-01-02 19:46:40
18   9  36.0      Kitchen  22.7 2022-01-02 20:46:40
19   5  35.9  Living Room  22.6 2022-01-02 20:46:40
20  18  36.9      Kitchen  23.3 2022-01-02 21:46:40
21   9  36.2  Living Room  22.8 2022-01-02 21:46:40
22  22  36.6      Kitchen  23.1 2022-01-02 22:46:40
23  14  36.3  Living Room  22.5 2022-01-02 22:46:40
24  26  36.5      Kitchen  22.7 2022-01-02 23:46:40
25  17  36.4  Living Room  22.2 2022-01-02 23:46:40
```

Next, [use pandas to analyze data](#use-pandas-to-analyze-data).

## Use pandas to analyze data

* [View data information and statistics](#view-data-information-and-statistics)
* [Downsample time series](#downsample-time-series)

### View data information and statistics

The following example shows how to use pandas `DataFrame` methods to transform and summarize data stored in InfluxDB Clustered.

```py
# pandas-example.py

from influxdb_client_3 import InfluxDBClient3
import pandas

# Instantiate an InfluxDB client configured for a database
client = InfluxDBClient3(
  "https://cluster-host.com",
  database="DATABASE_NAME",
  token="DATABASE_TOKEN")

# Execute the query to retrieve all record batches in the stream
# formatted as a PyArrow Table.
table = client.query(
  '''SELECT *
    FROM home
    WHERE time >= now() - INTERVAL '90 days'
    ORDER BY time'''
)

client.close()

# Convert the PyArrow Table to a pandas DataFrame.
dataframe = table.to_pandas()

# Print information about the results DataFrame,
# including the index dtype and columns, non-null values, and memory usage.
dataframe.info()

# Calculate descriptive statistics that summarize the distribution of the results.
print(dataframe.describe())

# Extract a DataFrame column.
print(dataframe['temp'])

# Print the DataFrame in Markdown format.
print(dataframe.to_markdown())
```

Replace the following configuration values:

* `DATABASE_NAME`: the name of the InfluxDB [database](/influxdb3/clustered/admin/databases/) to query
* `DATABASE_TOKEN`:
  a [database token](/influxdb3/clustered/admin/tokens/#database-tokens)with read permission on the specified database

### Downsample time series

The pandas library provides extensive features for working with time series data.

The [`pandas.DataFrame.resample()` method](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.resample.html) downsamples and upsamples data to time-based groups–for example:

```py
# pandas-example.py

...

# Use the `time` column to generate a DatetimeIndex for the DataFrame
dataframe = dataframe.set_index('time')

# Print information about the index
print(dataframe.index)

# Downsample data into 1-hour groups based on the DatetimeIndex
resample = dataframe.resample("1H")

# Print a summary that shows the start time and average temp for each group
print(resample['temp'].mean())
```

[](#view-example-results)

View example results

```sh
time
2023-07-16 22:00:00          NaN
2023-07-16 23:00:00    22.600000
2023-07-17 00:00:00    22.513889
2023-07-17 01:00:00    22.208333
2023-07-17 02:00:00    22.300000
...
Freq: H, Name: temp, Length: 469323, dtype: float64
```

For more detail and examples, see the [pandas documentation](https://pandas.pydata.org/docs/index.html).

#### Related

* [Use Python to query data](/influxdb3/clustered/query-data/execute-queries/client-libraries/python/)

[analysis](/influxdb3/clustered/tags/analysis/)[pandas](/influxdb3/clustered/tags/pandas/)[pyarrow](/influxdb3/clustered/tags/pyarrow/)[python](/influxdb3/clustered/tags/python/)
