Configure specialized cluster nodes

Optimize performance for specific workloads in your InfluxDB 3 Enterprise cluster by configuring specialized nodes in distributed deployments. Assign specific modes and thread allocations to nodes to maximize cluster efficiency.

Specialize nodes for specific workloads

In an InfluxDB 3 Enterprise cluster, you can dedicate nodes to specific tasks:

  • Ingest nodes: Optimized for high-throughput data ingestion
  • Query nodes: Optimized for complex analytical queries
  • Compactor nodes: Dedicated to data compaction and optimization
  • Process-capable nodes: Any node with --plugin-dir configured can execute Processing Engine plugins. Use --node-spec when creating a trigger to pin its execution to specific nodes.
  • All-in-one nodes: Balanced for mixed workloads (single-node deployments only)

Configure node modes

Pass the --mode parameter when starting the node to specify its capabilities:

# Single mode
influxdb3 serve --mode=ingest

# Multiple modes
influxdb3 serve --mode=ingest,query

# All modes (default, for single-node Enterprise only)
influxdb3 serve --mode=all

Available modes:

  • all: All capabilities enabled (single-node Enterprise deployments only)
  • ingest: Data ingestion and line protocol parsing
  • query: Query execution and data retrieval
  • compact: Background compaction and optimization
  • process: Activates the Processing Engine. process has no API surface of its own — it activates the Python virtual machine that runs trigger plugins. Setting --plugin-dir implies process mode, so you rarely need to set process explicitly. In a multi-node cluster, combine process with another mode (typically query, so plugins can call influxdb3_local.query() against the local engine) — see Configure process-capable nodes.

Don’t use all mode in a multi-node cluster

Use all mode for single-node Enterprise deployments only. Some cluster features such as replication and catalog refresh aren’t designed to work with all-mode nodes. In a multi-node cluster, use explicit modes (ingest, query, compact, process) and assign compact to exactly one node.
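These rules can be checked before rollout. The following is a minimal sketch, not part of the influxdb3 CLI: check_cluster_modes is a hypothetical helper that takes one --mode value per planned node. It fails if any node uses all, or if the number of nodes that include compact is not exactly one.

```shell
# Hypothetical pre-flight check (not an influxdb3 command).
# Usage: check_cluster_modes MODE [MODE...]   (one --mode value per node)
check_cluster_modes() (
  compactors=0
  for modes in "$@"; do
    # Split the comma-separated mode list and inspect each mode.
    for m in $(printf '%s' "$modes" | tr ',' ' '); do
      case "$m" in
        all)
          echo "error: 'all' mode used in a multi-node plan" >&2
          exit 1
          ;;
        compact)
          compactors=$((compactors + 1))
          ;;
      esac
    done
  done
  if [ "$compactors" -ne 1 ]; then
    echo "error: expected exactly 1 compact node, found $compactors" >&2
    exit 1
  fi
  echo "ok"
)

# A three-node plan with a single compactor passes the check.
check_cluster_modes ingest,query,compact ingest,query ingest,query
```

The function body runs in a subshell, so its variables do not leak into the calling shell and exit only ends the check itself.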

Allocate threads by node type

Critical concept: Thread pools

Every node has two thread pools that must be properly configured:

  1. IO threads: Parse line protocol, handle HTTP requests
  2. DataFusion threads: Execute queries, create data snapshots (convert WAL data to Parquet files), perform compaction

Even specialized nodes need both thread types. Ingest nodes use DataFusion threads for creating data snapshots that convert WAL data to Parquet files, and query nodes use IO threads for handling requests.
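In every sizing example on this page, the IO and DataFusion thread counts add up to the node's core count. Assuming you follow the same convention, the sketch below derives the DataFusion thread count from the core count and the IO thread count you choose. The datafusion_threads helper is hypothetical and only does the arithmetic; it does not call influxdb3.

```shell
# Hypothetical helper: DataFusion threads = cores - IO threads.
# Usage: datafusion_threads CORES IO_THREADS
datafusion_threads() (
  cores=$1
  io=$2
  # Both pools need at least one thread.
  if [ "$io" -lt 1 ] || [ "$io" -ge "$cores" ]; then
    echo "error: IO threads must be between 1 and $((cores - 1))" >&2
    exit 1
  fi
  echo $((cores - io))
)

datafusion_threads 32 12   # prints 20
datafusion_threads 64 4    # prints 60
```

Pass the results to --datafusion-num-threads and --num-io-threads respectively.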

Configure ingest nodes

Ingest nodes handle high-volume data writes and require significant IO thread allocation for line protocol parsing.

Example medium ingester (32 cores)

influxdb3 \
  --num-io-threads=12 \
  serve \
  --num-cores=32 \
  --datafusion-num-threads=20 \
  --exec-mem-pool-bytes=60% \
  --mode=ingest \
  --node-id=ingester-01

Configuration rationale:

  • 12 IO threads: Handle multiple concurrent writers (Telegraf agents, applications)
  • 20 DataFusion threads: Required for data snapshot operations that convert buffered writes to Parquet files
  • 60% memory pool: Balance between write buffers and data snapshot operations

Monitor ingest performance

Key metrics for ingest nodes:

# Monitor IO thread utilization (batch mode, so the output can be piped to grep)
top -H -b -n 1 -p "$(pgrep -o influxdb3)" | grep io_worker

# Check write request counts by endpoint
curl -s http://localhost:8181/metrics | grep 'http_requests_total.*write'

# Check overall HTTP request metrics
curl -s http://localhost:8181/metrics | grep 'http_requests_total'

# Monitor WAL size
du -sh /path/to/data/wal/

Scale IO threads with concurrent writers

If you see only 2 CPU cores at 100% on a large ingester, increase --num-io-threads. Each concurrent writer can utilize approximately one IO thread.

Configure query nodes

Query nodes execute complex analytical queries and need maximum DataFusion threads.

Analytical query node (64 cores)

influxdb3 \
  --num-io-threads=4 \
  serve \
  --num-cores=64 \
  --datafusion-num-threads=60 \
  --exec-mem-pool-bytes=90% \
  --parquet-mem-cache-size=8GB \
  --mode=query \
  --node-id=query-01 \
  --cluster-id=prod-cluster

Configuration rationale:

  • 4 IO threads: Minimal, just for HTTP request handling
  • 60 DataFusion threads: Maximum parallelism for query execution
  • 90% memory pool: Maximize memory for complex aggregations
  • 8 GB Parquet cache: Keep frequently accessed data in memory

Real-time query node (32 cores)

influxdb3 \
  --num-io-threads=6 \
  serve \
  --num-cores=32 \
  --datafusion-num-threads=26 \
  --exec-mem-pool-bytes=80% \
  --parquet-mem-cache-size=4GB \
  --mode=query \
  --node-id=query-02

Optimize query settings

You can configure DataFusion properties for additional tuning of query nodes:

influxdb3 serve \
  --datafusion-config "datafusion.execution.batch_size:16384,datafusion.execution.target_partitions:60" \
  --mode=query

Configure compactor nodes

Compactor nodes optimize stored data through background compaction processes.

Only one compactor node can run per cluster. Multiple compactors writing compacted data to the same location will cause data corruption. Any node mode that includes compaction (compact or all) counts toward this limit.

Dedicated compactor (32 cores)

influxdb3 \
  --num-io-threads=2 \
  serve \
  --num-cores=32 \
  --datafusion-num-threads=30 \
  --compaction-gen2-duration=24h \
  --compaction-check-interval=5m \
  --mode=compact \
  --node-id=compactor-01 \
  --cluster-id=prod-cluster

# Note: the --compaction-row-limit option is not yet released in v3.5.0.
# When it becomes available, add it to the serve flags above, for example:
#   --compaction-row-limit=2000000

Configuration rationale:

  • 2 IO threads: Minimal, compaction is DataFusion-intensive
  • 30 DataFusion threads: Maximum threads for sort/merge operations
  • 24h gen2 duration: Time-based compaction strategy

Tune compaction parameters

You can adjust compaction strategies to balance performance and resource usage:

# Configure compaction strategy (add these flags to your compactor's serve command)
influxdb3 serve \
  --mode=compact \
  --compaction-multipliers=4,8,16 \
  --compaction-max-num-files-per-plan=100 \
  --compaction-cleanup-wait=10m

Configure process-capable nodes

Any node with --plugin-dir configured can execute Processing Engine plugins. Setting --plugin-dir implicitly adds process mode regardless of the node’s other modes; explicit --mode=process requires --plugin-dir to be set.

Configure --plugin-dir on every cluster node

The Enterprise catalog registers triggers cluster-wide. Every node validates the registered triggers at startup, even nodes that don’t execute them — for example, ingest-only and compact-only nodes. If a plugin file referenced by a registered trigger is missing on a node, the engine panics on startup.

Configure --plugin-dir on every node and make the same plugin files available to each one (for example, by mounting a shared directory in your container or pod spec). Use --node-spec on each trigger to control which nodes actually execute it.
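One way to confirm that two nodes see the same plugin files is to compare checksum manifests of their plugin directories. This is a minimal sketch under stated assumptions: plugin_manifest and same_plugins are hypothetical helpers, and the two temporary directories stand in for the --plugin-dir paths of two nodes.

```shell
# Hypothetical helper: print a sorted checksum manifest of a plugin directory.
plugin_manifest() (
  cd "$1" || exit 1
  # cksum prints "CRC SIZE PATH"; sorting by path makes the order stable.
  find . -type f -exec cksum {} + | sort -k3
)

# Succeeds only when both directories exist and hold identical files.
same_plugins() {
  [ -d "$1" ] && [ -d "$2" ] &&
    [ "$(plugin_manifest "$1")" = "$(plugin_manifest "$2")" ]
}

# Demo: two local directories standing in for two nodes' plugin dirs.
a=$(mktemp -d)
b=$(mktemp -d)
echo 'print("hi")' > "$a/alert.py"
cp "$a/alert.py" "$b/alert.py"
same_plugins "$a" "$b" && echo "plugin dirs match"
rm -rf "$a" "$b"
```

For nodes on different hosts, you could run plugin_manifest on each node and compare the saved outputs with diff.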

Enable the Processing Engine on any node

influxdb3 \
  --num-io-threads=4 \
  serve \
  --num-cores=16 \
  --datafusion-num-threads=12 \
  --plugin-dir=/path/to/plugins \
  --node-id=hybrid-01 \
  --cluster-id=prod-cluster

Process + query node (16 cores)

This is the recommended pattern for a node that hosts schedule plugins. Combining process with query lets plugins call influxdb3_local.query() against the local engine without an extra network hop:

influxdb3 \
  --num-io-threads=4 \
  serve \
  --num-cores=16 \
  --datafusion-num-threads=12 \
  --plugin-dir=/path/to/plugins \
  --mode=process,query \
  --node-id=processor-01 \
  --cluster-id=prod-cluster

A node in process,query mode doesn’t accept writes locally. Schedule plugins running on it that need to write results back to the cluster must POST line protocol to an ingest node.

Cross-node write-back example

The influxdb3-ref-network-telemetry reference architecture’s plugins/_writeback.py helper round-robins writes across configured ingest URLs with one fallback hop on connection error.
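The idea of round-robin writes with one fallback hop can be sketched in shell. This is not the helper's actual code, which is Python. The host names, the sensors database, the write endpoint path, and the token placeholder are all assumptions for illustration.

```shell
# Sketch only. Hosts, database name, and endpoint path are assumptions.
INGEST_URLS="http://ingester-01:8181 http://ingester-02:8181"
RR_INDEX=0   # rotation counter: selects which URL is tried first

# Print the Nth (1-based) argument that follows N itself.
nth() ( n=$1; shift; shift $((n - 1)); printf '%s\n' "$1" )

# Default sender: POST line protocol to one ingest node.
# Without -f, curl exits non-zero only on connection-level errors.
send_lp() {
  curl -s -o /dev/null -X POST "$1/api/v3/write_lp?db=sensors" \
    -H "Authorization: Bearer YOUR_TOKEN" \
    --data-binary "$2"
}

# Send a payload to the next URL in rotation. On failure, try exactly
# one more URL (the following one in the list) before giving up.
write_back() {
  wb_payload=$1
  wb_count=$(set -- $INGEST_URLS; echo $#)
  wb_first=$((RR_INDEX % wb_count + 1))
  wb_second=$((wb_first % wb_count + 1))
  RR_INDEX=$((RR_INDEX + 1))
  "${SEND:-send_lp}" "$(nth "$wb_first" $INGEST_URLS)" "$wb_payload" ||
    "${SEND:-send_lp}" "$(nth "$wb_second" $INGEST_URLS)" "$wb_payload"
}
```

Calling write_back 'cpu,host=a usage=0.5' sends to the first URL, the next call starts at the second URL, and so on. Set SEND to the name of another function to swap the transport, for example in a dry run. With a single configured URL, the fallback retries the same node once.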

Multi-mode configurations

Some deployments benefit from nodes handling multiple responsibilities.

Ingest + Query node (48 cores)

influxdb3 \
  --num-io-threads=12 \
  serve \
  --num-cores=48 \
  --datafusion-num-threads=36 \
  --exec-mem-pool-bytes=75% \
  --mode=ingest,query \
  --node-id=hybrid-01

Query + Compact node (32 cores)

influxdb3 \
  --num-io-threads=4 \
  serve \
  --num-cores=32 \
  --datafusion-num-threads=28 \
  --mode=query,compact \
  --node-id=qc-01

Cluster architecture examples

Small cluster (3 nodes)

Only one node per cluster can run compaction. In this example, Node 1 handles ingest, query, and compaction while Nodes 2–3 handle ingest and query only.

# Node 1: Ingest, query, and compactor
mode: ingest,query,compact
cores: 32
io_threads: 8
datafusion_threads: 24
---
# Node 2: Ingest and query (no compaction)
mode: ingest,query
cores: 32
io_threads: 8
datafusion_threads: 24
---
# Node 3: Ingest and query (no compaction)
mode: ingest,query
cores: 32
io_threads: 8
datafusion_threads: 24

Medium cluster (6 nodes)

# Nodes 1-2: Ingesters
mode: ingest
cores: 48
io_threads: 16
datafusion_threads: 32
---
# Nodes 3-4: Query nodes
mode: query
cores: 48
io_threads: 4
datafusion_threads: 44
---
# Node 5: Compactor (only one compactor per cluster)
mode: compact
cores: 32
io_threads: 4
datafusion_threads: 28
---
# Node 6: Process node
mode: process
cores: 32
io_threads: 4
datafusion_threads: 28

Large cluster (12+ nodes)

# Nodes 1-4: High-throughput ingesters
mode: ingest
cores: 96
io_threads: 20
datafusion_threads: 76
---
# Nodes 5-8: Query nodes
mode: query
cores: 64
io_threads: 4
datafusion_threads: 60
---
# Node 9: Dedicated compactor (only one compactor per cluster)
mode: compact
cores: 32
io_threads: 2
datafusion_threads: 30
---
# Nodes 10-12: Process nodes
mode: process
cores: 32
io_threads: 6
datafusion_threads: 26
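The node specs above are descriptive rather than files that influxdb3 reads. As a sketch of how one spec maps onto the flags used elsewhere on this page, the hypothetical serve_cmd helper below prints (but does not run) the matching command. The node IDs are placeholders. Add --cluster-id and any other flags your deployment needs.

```shell
# Hypothetical helper: print the influxdb3 command for one node spec.
# Usage: serve_cmd MODE CORES IO_THREADS DATAFUSION_THREADS NODE_ID
serve_cmd() (
  mode=$1
  cores=$2
  io=$3
  df=$4
  node_id=$5
  # --num-io-threads is passed before the serve subcommand, as in the
  # examples on this page; the remaining flags follow serve.
  printf 'influxdb3 --num-io-threads=%s serve --num-cores=%s --datafusion-num-threads=%s --mode=%s --node-id=%s\n' \
    "$io" "$cores" "$df" "$mode" "$node_id"
)

serve_cmd ingest 96 20 76 ingester-01
serve_cmd compact 32 2 30 compactor-01
```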

Scale your cluster

Vertical scaling limitations

InfluxDB 3 Enterprise uses a shared-nothing architecture where ingest nodes handle all writes. To maximize ingest performance:

  • Scale IO threads with concurrent writers: Each concurrent writer can utilize approximately one IO thread for line protocol parsing
  • Use high-core machines: Line protocol parsing is CPU-intensive and benefits from more cores
  • Deploy multiple ingest nodes: Run several ingest nodes behind a load balancer to distribute write load
  • Optimize batch sizes: Configure clients to send larger batches to reduce per-request overhead
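For clients that write from files, batching can be as simple as splitting the line protocol into chunks and sending one request per chunk. The sketch below assumes a hypothetical send_in_batches helper and a SENDER command that receives each chunk's file path. It defaults to echo, which makes the call a dry run.

```shell
# Sketch: split a line protocol file into batches of BATCH_SIZE lines and
# hand each batch file to a sender command (dry run with echo by default).
# Usage: send_in_batches FILE BATCH_SIZE
send_in_batches() (
  file=$1
  batch_size=$2
  sender=${SENDER:-echo}
  tmpdir=$(mktemp -d)
  # split writes chunk files named batch_aa, batch_ab, ... in input order.
  split -l "$batch_size" "$file" "$tmpdir/batch_"
  for chunk in "$tmpdir"/batch_*; do
    [ -e "$chunk" ] || continue   # nothing to send for an empty input file
    $sender "$chunk"
  done
  rm -rf "$tmpdir"
)

# Demo: three points in batches of two, using cat as the "sender".
lp=$(mktemp)
printf 'cpu,host=a usage=1\ncpu,host=a usage=2\ncpu,host=a usage=3\n' > "$lp"
SENDER=cat send_in_batches "$lp" 2
rm -f "$lp"
```

In practice, SENDER would be a command that posts the chunk file to an ingest node, and the batch size would be tuned against observed write latency.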

Scale queries horizontally

Query nodes can scale horizontally since they all access the same object store:

# Add query nodes as needed
for i in {1..10}; do
  influxdb3 \
    --num-io-threads=4 \
    serve \
    --num-cores=32 \
    --datafusion-num-threads=28 \
    --mode=query \
    --node-id=query-$i &
done

Monitor performance

Node-specific metrics

Monitor specialized nodes differently based on their role:

Ingest nodes

-- Monitor write activity through parquet file creation
SELECT
  table_name,
  count(*) as files_created,
  sum(row_count) as total_rows,
  sum(size_bytes) as total_bytes
FROM system.parquet_files
WHERE max_time > extract(epoch from now() - INTERVAL '5 minutes') * 1000000000
GROUP BY table_name;

Query nodes

-- Monitor query performance
SELECT
  count(*) as query_count,
  avg(execute_duration) as avg_execute_time,
  max(max_memory) as max_memory_bytes
FROM system.queries
WHERE issue_time > now() - INTERVAL '5 minutes'
  AND success = true;

Compactor nodes

-- Monitor compaction progress
SELECT
  event_type,
  event_status,
  count(*) as event_count,
  avg(event_duration) as avg_duration
FROM system.compaction_events
WHERE event_time > now() - INTERVAL '1 hour'
GROUP BY event_type, event_status
ORDER BY event_count DESC;

Monitor cluster-wide metrics

# Check node health via HTTP endpoints
for node in ingester-01:8181 query-01:8181 compactor-01:8181; do
  echo "Node: $node"
  curl -s "http://$node/health"
done

# Monitor metrics from each node
for node in ingester-01:8181 query-01:8181 compactor-01:8181; do
  echo "=== Metrics from $node ==="
  curl -s "http://$node/metrics" | grep -E "(cpu_usage|memory_usage|http_requests_total)"
done

# Query system tables for cluster-wide monitoring
curl -X POST "http://query-01:8181/api/v3/query_sql" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -d '{
    "q": "SELECT * FROM system.queries WHERE issue_time > now() - INTERVAL '\''5 minutes'\'' ORDER BY issue_time DESC LIMIT 10",
    "db": "sensors"
  }'

Extend monitoring with plugins

Enhance your cluster monitoring capabilities using the InfluxDB 3 processing engine. The InfluxDB 3 plugins library includes several monitoring and alerting plugins:

  • System metrics collection: Collect CPU, memory, disk, and network statistics
  • Threshold monitoring: Monitor metrics with configurable thresholds and alerting
  • Multi-channel notifications: Send alerts via Slack, Discord, SMS, WhatsApp, and webhooks
  • Anomaly detection: Identify unusual patterns in your data
  • Deadman checks: Detect missing data streams

For complete plugin documentation and setup instructions, see Process data in InfluxDB 3 Enterprise.

Monitor and respond to performance issues

Use the monitoring queries to identify the following patterns and their solutions:

High CPU with low throughput (Ingest nodes)

Detection query:

-- Check for high failed query rate indicating parsing issues
SELECT
  count(*) as total_queries,
  sum(CASE WHEN success = true THEN 1 ELSE 0 END) as successful_queries,
  sum(CASE WHEN success = false THEN 1 ELSE 0 END) as failed_queries
FROM system.queries
WHERE issue_time > now() - INTERVAL '5 minutes';

Symptoms:

  • Only 2 CPU cores at 100% on large machines
  • High write latency despite available resources
  • Failed queries due to parsing timeouts

Solution: Increase IO threads (see Ingest node issues)

Memory pressure alerts (Query nodes)

Detection query:

-- Monitor queries with high memory usage or failures
SELECT
  avg(max_memory) as avg_memory_bytes,
  max(max_memory) as peak_memory_bytes,
  sum(CASE WHEN success = false THEN 1 ELSE 0 END) as failed_queries
FROM system.queries
WHERE issue_time > now() - INTERVAL '5 minutes'
  AND query_type = 'sql';

Symptoms:

  • Queries failing with out-of-memory errors
  • High memory usage approaching pool limits
  • Slow query execution times

Solution: Increase memory pool or optimize queries (see Query node issues)

Compaction falling behind (Compactor nodes)

Detection query:

-- Check compaction event frequency and success rate
SELECT
  event_type,
  count(*) as event_count,
  sum(CASE WHEN event_status = 'success' THEN 1 ELSE 0 END) as successful_events
FROM system.compaction_events
WHERE event_time > now() - INTERVAL '1 hour'
GROUP BY event_type;

Symptoms:

  • Decreasing compaction event frequency
  • Growing number of small Parquet files
  • Increasing query times due to file fragmentation

Solution: For nodes using the Parquet-backed storage engine, increase DataFusion threads on your single compactor node (see Compactor node issues). The Performance Preview with PachaTree storage does not use DataFusion for compaction—refer to the Performance Preview documentation for tuning guidance.

Troubleshoot node configurations

Ingest node issues

Problem: Low throughput despite available CPU

# Check: Are only 2 cores busy?
top -H -p $(pgrep influxdb3)

# Solution: Increase IO threads
--num-io-threads=16

Problem: Data snapshot creation affecting ingest

# Check: DataFusion threads at 100% during data snapshots to Parquet
# Solution: Reserve more DataFusion threads for snapshot operations
--datafusion-num-threads=40

Query node issues

Problem: Slow queries despite resources

# Check: Memory pressure
free -h

# Solution: Increase memory pool
--exec-mem-pool-bytes=90%

Problem: Poor cache hit rates

# Solution: Increase Parquet cache
--parquet-mem-cache-size=10GB

Compactor node issues

Problem: Compaction falling behind

# Check: Compaction queue length
# Solution: Increase threads on the single compactor node (only one compactor is allowed per cluster)
--datafusion-num-threads=30

Best practices

  1. Start with monitoring: Understand bottlenecks before specializing nodes
  2. Test mode combinations: Some workloads benefit from multi-mode nodes
  3. Plan for failure: Ensure redundancy in critical node types
  4. Document your topology: Keep clear records of node configurations
  5. Regular rebalancing: Adjust thread allocation as workloads evolve
  6. Capacity planning: Monitor trends and scale proactively

Migrate to specialized nodes

From single-node to specialized cluster

all mode is only for single-node Enterprise deployments.

When scaling a single-node deployment that runs in all mode to a multi-node cluster:

  • Replace the all node with nodes that have explicit, specialized modes
  • Assign compact mode to exactly one node that uses the same node-id as the all node being replaced

# Phase 1: Single-node deployment
influxdb3 serve \
  --node-id=node0 \
  --cluster-id=my-cluster \
  --mode=all \
  --num-io-threads=8

# Phase 2: Scale to multi-node cluster
# Stop the all-mode node and start specialized nodes.
# The compact node MUST use the same --node-id as the replaced all-mode node.

# Compact node: reuses the same node-id as the replaced all-mode node
influxdb3 serve \
  --node-id=node0 \
  --cluster-id=my-cluster \
  --mode=compact

# Ingest and query node
influxdb3 serve \
  --node-id=node1 \
  --cluster-id=my-cluster \
  --mode=ingest,query \
  --num-io-threads=8

# Phase 3: Full specialization (optional)
# Dedicated ingest node
influxdb3 serve \
  --node-id=node1 \
  --cluster-id=my-cluster \
  --mode=ingest \
  --num-io-threads=16

# Dedicated query node
influxdb3 serve \
  --node-id=node2 \
  --cluster-id=my-cluster \
  --mode=query \
  --num-io-threads=4

Manage configurations

Configure using environment variables

# Set environment variables for node type
export INFLUXDB3_ENTERPRISE_MODE=ingest
export INFLUXDB3_NUM_IO_THREADS=20
export INFLUXDB3_DATAFUSION_NUM_THREADS=76

influxdb3 serve --node-id=$HOSTNAME --cluster-id=prod
