Summarize query results and data distribution

Query data stored in InfluxDB and use tools like pandas to summarize the schema and distribution of the results.

Sample data

The following examples use the sample data written in the Get started writing data guide. To run the example queries and return results, first write the sample data to your InfluxDB Cloud Serverless bucket.

View data information and statistics

Using Python and pandas

The following example uses the InfluxDB client library for Python to query an InfluxDB Cloud Serverless bucket, and then uses the pandas DataFrame.info() and DataFrame.describe() methods to summarize the schema and distribution of the data.

  1. In your editor, create a file (for example, pandas-example.py) and enter the following sample code:

    # pandas-example.py
    
    from influxdb_client_3 import InfluxDBClient3
    import pandas  # required by table.to_pandas()
    
    # Replace API_TOKEN and BUCKET_NAME with your own values.
    client = InfluxDBClient3(token='API_TOKEN',
                             host='cloud2.influxdata.com',
                             database='BUCKET_NAME',
                             org="")
    
    # Query all rows in the home measurement.
    table = client.query("select * from home where room like '%'")
    
    # Convert the query results to a pandas DataFrame.
    dataframe = table.to_pandas()
    
    # Print information about the results DataFrame,
    # including the index dtype and columns, non-null values, and memory usage.
    dataframe.info()
    
    # Calculate descriptive statistics that summarize the distribution of the results.
    print(dataframe.describe())
    
  2. Enter the following command in your terminal to execute the file using the Python interpreter:

    python pandas-example.py
    

    The output is similar to the following:

    <class 'pandas.core.frame.DataFrame'>
    RangeIndex: 411 entries, 0 to 410
    Data columns (total 8 columns):
     #   Column     Non-Null Count  Dtype
    ---  ------     --------------  -----
     0   co         405 non-null    float64
     1   host       2 non-null      object
     2   hum        406 non-null    float64
     3   room       411 non-null    object
     4   sensor     1 non-null      object
     5   sensor_id  2 non-null      object
     6   temp       411 non-null    float64
     7   time       411 non-null    datetime64[ns]
    dtypes: datetime64[ns](1), float64(3), object(4)
    memory usage: 25.8+ KB
    
                  co         hum        temp                           time
    count  405.000000  406.000000  411.000000                            411
    mean     5.320988   35.860591   23.803893  2008-06-12 13:33:49.074302208
    min      0.000000   20.200000   18.400000     1970-01-01 00:00:01.641024
    25%      0.000000   35.900000   22.200000  1970-01-01 00:00:01.685054600
    50%      1.000000   36.000000   22.500000            2023-03-21 05:46:40
    75%      9.000000   36.300000   22.800000            2023-07-15 21:34:10
    max     26.000000   80.000000   74.000000            2023-07-17 02:07:00
    std      7.640154    3.318794    8.408807                            NaN
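
To experiment with DataFrame.info() and DataFrame.describe() without a connection to InfluxDB, the following is a minimal sketch that applies the same calls to a small in-memory DataFrame. The rows are hypothetical values shaped like the home measurement, not the actual sample data, and the variable names are chosen only for illustration.

```python
import io

import pandas as pd

# Hypothetical rows shaped like the home measurement.
# These values are illustrative only and are not the sample data set.
dataframe = pd.DataFrame(
    {
        "time": pd.to_datetime(
            ["2022-01-01 08:00:00", "2022-01-01 09:00:00", "2022-01-01 10:00:00"]
        ),
        "room": ["Kitchen", "Kitchen", "Living Room"],
        "temp": [21.0, 23.0, 22.0],
        "hum": [35.9, 36.2, None],  # one missing value
    }
)

# info() prints to stdout by default; pass buf to capture the text instead.
buffer = io.StringIO()
dataframe.info(buf=buffer)
info_text = buffer.getvalue()
print(info_text)

# By default, describe() summarizes numeric columns (recent pandas versions
# also include datetime columns) and ignores missing values.
stats = dataframe.describe()
print(stats)

# include="all" also summarizes non-numeric columns such as room.
all_stats = dataframe.describe(include="all")
print(all_stats)
```

In this sketch, the count row of describe() reports 2 for hum and 3 for temp, which mirrors how the non-null counts from info() reveal sparsely populated columns in the query results above.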
    
