Optimize queries
Optimize SQL and InfluxQL queries to improve performance and reduce their memory and compute (CPU) requirements. Learn how to use observability tools to analyze query execution and view metrics.
Why is my query slow?
Query performance depends on factors like the time range and query complexity. If a query is slower than expected, consider the following potential causes:
- The query spans a large time range, which increases the amount of data being processed.
- The query performs intensive operations, such as:
  - Sorting or re-sorting large datasets with ORDER BY.
  - Querying many string values, which can be computationally expensive.
Strategies for improving query performance
The following design strategies generally improve query performance and resource usage:
- Follow schema design best practices to simplify and improve queries.
- Query only the data you need to reduce unnecessary processing.
- Downsample data to decrease the volume of data queried.
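As a rough illustration of the downsampling strategy, the following sketch aggregates raw points into hourly averages. The table name `home`, the tag `room`, and the field `temp` are hypothetical and are not taken from this page; the query assumes the `DATE_BIN` function available in InfluxDB 3 SQL.

```sql
-- Assumption: a hypothetical table "home" with a tag "room" and a numeric field "temp".
-- Groups the last 7 days of data into 1-hour windows and averages each window.
SELECT
  DATE_BIN(INTERVAL '1 hour', time) AS hour_start,
  room,
  AVG(temp) AS avg_temp
FROM home
WHERE time >= now() - INTERVAL '7 days'
GROUP BY 1, room;
```

Writing results like these to a separate table is one way to let later queries read far fewer rows than the raw data contains.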
Query only the data you need
Include a WHERE clause
InfluxDB 3 stores data in a Parquet file for each partition.
By default, InfluxDB Clustered partitions tables by day, but you can also
custom-partition your data.
At query time, InfluxDB retrieves files from the Object store to answer a query.
To reduce the number of files that a query needs to retrieve from the Object store,
include a WHERE clause that filters data by a time range or by specific tag values.
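For example, a query along the following lines restricts both the time range and a tag value. This is a minimal sketch: the table `home`, the tag `room`, the field `temp`, and the value `'Kitchen'` are illustrative assumptions, not names from this page.

```sql
-- Assumption: a hypothetical table "home" with a tag "room" and a field "temp".
-- The time predicate limits which partitions (and Parquet files) are read;
-- the tag predicate narrows the rows further.
SELECT time, room, temp
FROM home
WHERE time >= now() - INTERVAL '1 day'
  AND room = 'Kitchen';
```

With the default daily partitioning, a one-day time filter like this should generally touch far fewer files than the same query without a WHERE clause.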
SELECT only columns you need
Because InfluxDB 3 is a columnar database, it only processes the columns selected in a query, which can mitigate the query performance impact of wide schemas.
However, a non-specific query that retrieves a large number of columns from a wide schema can be slower and less efficient than a more targeted query. For example, consider the following queries:
SELECT time,a,b,c
SELECT *
If the table contains 10 columns, the difference in performance between the
two queries is minimal.
In a table with over 1000 columns, the SELECT * query is slower and
less efficient.
Recognize and address bottlenecks
To identify performance bottlenecks, learn how to analyze a query plan. Query plans provide runtime metrics, such as the number of files scanned, that may reveal inefficiencies in query execution.
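One way to view a query plan is to prefix a query with `EXPLAIN` or `EXPLAIN ANALYZE`. The sketch below assumes a hypothetical table `home` with a tag `room` and a field `temp`; the exact plan output depends on your data and is not shown here.

```sql
-- Assumption: "home", "room", and "temp" are hypothetical names.
-- EXPLAIN returns the logical and physical plans without running the query.
EXPLAIN
SELECT time, room, temp
FROM home
WHERE time >= now() - INTERVAL '1 day';

-- EXPLAIN ANALYZE runs the query and annotates the plan with runtime metrics.
EXPLAIN ANALYZE
SELECT time, room, temp
FROM home
WHERE time >= now() - INTERVAL '1 day';
```

Because `EXPLAIN ANALYZE` executes the query, it takes about as long as the query itself, so it may be worth narrowing the time range while investigating.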
Request help to troubleshoot queries
Some bottlenecks may result from suboptimal query execution plans and are outside your control. For example:
- Sorting (ORDER BY) data that is already sorted.
- Retrieving numerous small Parquet files from the object store instead of fewer, larger files.
- Querying many overlapped Parquet files.
- Performing a high number of table scans.
If you’ve followed steps to optimize and troubleshoot a query, but it still doesn’t meet performance requirements, see how to report query performance issues.
Query trace logging
Currently, customers cannot enable trace logging for InfluxDB clusters. InfluxData engineers can use query plans and trace logging to help pinpoint performance bottlenecks in a query.
See how to report query performance issues.