---
title: Use the PyArrow library to analyze data
description: Use PyArrow to read and analyze InfluxDB query results from InfluxDB Cloud Serverless.
url: https://docs.influxdata.com/influxdb3/cloud-serverless/process-data/tools/pyarrow/
estimated_tokens: 1240
publisher: InfluxData
canonical: https://docs.influxdata.com/influxdb3/cloud-serverless/process-data/tools/pyarrow/
date: '2026-05-15T15:46:14-06:00'
lastmod: '2026-05-15T15:46:14-06:00'
---

Use [PyArrow](https://arrow.apache.org/docs/python/) to read and analyze query
results from InfluxDB Cloud Serverless.
The PyArrow library provides efficient computation, aggregation, serialization,
and conversion of Arrow format data.

> Apache Arrow is a development platform for in-memory analytics. It contains a set of technologies that enable
> big data systems to store, process and move data fast.
>
> The Arrow Python bindings (also named “PyArrow”) have first-class integration with NumPy, pandas, and built-in Python objects. They are based on the C++ implementation of Arrow.
>
> [PyArrow documentation](https://arrow.apache.org/docs/python/index.html)

* [Install prerequisites](#install-prerequisites)
* [Use PyArrow to read query results](#use-pyarrow-to-read-query-results)
* [Use PyArrow to analyze data](#use-pyarrow-to-analyze-data)
  * [Group and aggregate data](#group-and-aggregate-data)

## Install prerequisites

The examples in this guide assume that you are using a Python virtual environment and the InfluxDB 3 [`influxdb3-python` Python client library](/influxdb3/cloud-serverless/reference/client-libraries/v3/python/).
For more information, see how to [get started using Python to query InfluxDB](/influxdb3/cloud-serverless/query-data/execute-queries/flight-sql/python/).

Installing `influxdb3-python` also installs the [`pyarrow`](https://arrow.apache.org/docs/python/index.html) library that provides Python bindings for Apache Arrow.
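To confirm that both packages are importable in the active virtual environment, a minimal check along these lines may help. The `is_installed` helper is a hypothetical name introduced here for illustration; it uses only the Python standard library and does not connect to InfluxDB.

```python
from importlib.util import find_spec


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found in this environment."""
    return find_spec(module_name) is not None


# influxdb_client_3 is the module that the influxdb3-python package provides.
for name in ("influxdb_client_3", "pyarrow"):
    status = "installed" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```

If either module is reported as missing, install `influxdb3-python` in the same environment and run the check again.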

## Use PyArrow to read query results

The following example shows how to use `influxdb3-python` and `pyarrow` to query InfluxDB and view Arrow data as a PyArrow `Table`.

1. In your editor, copy and paste the following sample code to a new file (for example, `pyarrow-example.py`):

   ```
   # pyarrow-example.py

   from influxdb_client_3 import InfluxDBClient3

   def querySQL():

     # Instantiate an InfluxDB client configured for a bucket
     client = InfluxDBClient3(
       "https://cloud2.influxdata.com",
       database="BUCKET_NAME",
       token="API_TOKEN")

     # Execute the query to retrieve all record batches in the stream formatted as a PyArrow Table.
     table = client.query(
       '''SELECT *
         FROM home
         WHERE time >= now() - INTERVAL '90 days'
         ORDER BY time'''
     )

     client.close()

     # Return the table so that the caller can print or analyze it
     return table

   print(querySQL())
   ```

2. Replace the following configuration values:

   * `API_TOKEN`: An InfluxDB [token](/influxdb3/cloud-serverless/admin/tokens/) with read permissions on the buckets you want to query.
   * `BUCKET_NAME`: The name of the InfluxDB [bucket](/influxdb3/cloud-serverless/admin/buckets/) to query.

3. In your terminal, use the Python interpreter to run the file:

   ```
   python pyarrow-example.py
   ```

The `InfluxDBClient3.query()` method sends the query request, and then returns a [`pyarrow.Table`](https://arrow.apache.org/docs/python/generated/pyarrow.Table.html) that contains all the Arrow record batches from the response stream.

Next, [use PyArrow to analyze data](#use-pyarrow-to-analyze-data).

## Use PyArrow to analyze data

### Group and aggregate data

With a `pyarrow.Table`, you can use values in a column as *keys* for grouping.

The following example shows how to query InfluxDB, and then use PyArrow to group the table data and calculate an aggregate value for each group:

```
# pyarrow-example.py

from influxdb_client_3 import InfluxDBClient3

def querySQL():

  # Instantiate an InfluxDB client configured for a bucket
  client = InfluxDBClient3(
    "https://cloud2.influxdata.com",
    database="BUCKET_NAME",
    token="API_TOKEN")

  # Execute the query to retrieve data
  # formatted as a PyArrow Table
  table = client.query(
    '''SELECT *
      FROM home
      WHERE time >= now() - INTERVAL '90 days'
      ORDER BY time'''
  )

  client.close()

  return table

table = querySQL()

# Use PyArrow to aggregate data
print(table.group_by('room').aggregate([('temp', 'mean')]))
```

Replace the following:

* `API_TOKEN`: An InfluxDB [token](/influxdb3/cloud-serverless/admin/tokens/) with read permissions on the buckets you want to query.
* `BUCKET_NAME`: The name of the InfluxDB [bucket](/influxdb3/cloud-serverless/admin/buckets/) to query.

Example results:

```
pyarrow.Table
temp_mean: double
room: string
----
temp_mean: [[22.581987577639747,22.10807453416151]]
room: [["Kitchen","Living Room"]]
```

For more detail and examples, see the [PyArrow documentation](https://arrow.apache.org/docs/python/getstarted.html) and the [Apache Arrow Python Cookbook](https://arrow.apache.org/cookbook/py/data.html).

#### Related

* [Use pandas to analyze and visualize data](/influxdb3/cloud-serverless/process-data/tools/pandas/)
* [Query data with SQL](/influxdb3/cloud-serverless/query-data/sql/)
* [Use Python to query data](/influxdb3/cloud-serverless/query-data/execute-queries/client-libraries/python/)
