
System metrics plugin

The System Metrics Plugin provides system monitoring for InfluxDB 3, collecting CPU, memory, disk, and network metrics from the host system. It reports per-core CPU statistics, memory usage breakdowns, disk I/O performance, and network interface statistics. You choose which metric types to collect, and the plugin retries failed operations up to a configurable limit.

Configuration

Required parameters

None. With no arguments, the plugin collects all metric types using the defaults listed below.

System monitoring parameters

| Parameter       | Type    | Default   | Description                                                                    |
|-----------------|---------|-----------|--------------------------------------------------------------------------------|
| hostname        | string  | localhost | Hostname to tag all metrics with for system identification                     |
| include_cpu     | boolean | true      | Include comprehensive CPU metrics collection (overall and per-core statistics) |
| include_memory  | boolean | true      | Include memory metrics collection (RAM usage, swap statistics, page faults)    |
| include_disk    | boolean | true      | Include disk metrics collection (partition usage, I/O statistics, performance) |
| include_network | boolean | true      | Include network metrics collection (interface statistics and error counts)     |
| max_retries     | integer | 3         | Maximum retry attempts on failure with graceful error handling                 |
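
Trigger arguments are passed on the command line as key=value pairs, so a plugin typically has to convert them from strings. The following is a minimal sketch of that conversion using the defaults in the table above. It assumes the arguments arrive as a dictionary of strings; parse_config, parse_bool, and DEFAULTS are hypothetical names for illustration, not the plugin's actual code.

```python
# Hypothetical helpers: these names are illustrative and are not taken
# from system_metrics.py. Defaults mirror the parameter table above.
DEFAULTS = {
    "hostname": "localhost",
    "include_cpu": True,
    "include_memory": True,
    "include_disk": True,
    "include_network": True,
    "max_retries": 3,
}


def parse_bool(value, default):
    """Interpret common string spellings of a boolean, else return default."""
    if isinstance(value, bool):
        return value
    if value is None:
        return default
    text = str(value).strip().lower()
    if text in ("true", "1", "yes"):
        return True
    if text in ("false", "0", "no"):
        return False
    return default


def parse_config(args):
    """Merge trigger arguments (assumed to be strings) over the defaults."""
    args = args or {}
    config = dict(DEFAULTS)
    config["hostname"] = str(args.get("hostname", DEFAULTS["hostname"]))
    for key in ("include_cpu", "include_memory", "include_disk", "include_network"):
        config[key] = parse_bool(args.get(key), DEFAULTS[key])
    try:
        config["max_retries"] = int(args.get("max_retries", DEFAULTS["max_retries"]))
    except (TypeError, ValueError):
        # Fall back to the documented default when the value is not an integer.
        config["max_retries"] = DEFAULTS["max_retries"]
    return config


if __name__ == "__main__":
    print(parse_config({"hostname": "web-server-01", "include_disk": "false", "max_retries": "5"}))
```

With this approach, include_disk=false on the command line becomes the boolean False, and a max_retries value that is not an integer falls back to 3.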

TOML configuration

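| Parameter        | Type   | Default | Description                                                                     |
|------------------|--------|---------|---------------------------------------------------------------------------------|
| config_file_path | string | none    | TOML config file path relative to PLUGIN_DIR (required for TOML configuration) |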

To use a TOML configuration file, set the PLUGIN_DIR environment variable and specify the config_file_path in the trigger arguments. This is in addition to the --plugin-dir flag when starting InfluxDB 3.

Example TOML configuration

system_metrics_config_scheduler.toml

For more information on using TOML configuration files, see the Using TOML Configuration Files section in the influxdb3_plugins/README.md.

Installation steps

  1. Start InfluxDB 3 Enterprise with the Processing Engine enabled (--plugin-dir /path/to/plugins)

  2. Install required Python packages:

    • psutil (for system metrics collection)
    influxdb3 install package psutil

Trigger setup

Basic scheduled trigger

Monitor system performance every 30 seconds:

influxdb3 create trigger \
  --database system_monitoring \
  --plugin-filename gh:influxdata/system_metrics/system_metrics.py \
  --trigger-spec "every:30s" \
  system_metrics_trigger

Custom configuration

Monitor specific metrics with a custom hostname:

influxdb3 create trigger \
  --database system_monitoring \
  --plugin-filename gh:influxdata/system_metrics/system_metrics.py \
  --trigger-spec "every:30s" \
  --trigger-arguments hostname=web-server-01,include_disk=false,max_retries=5 \
  system_metrics_custom_trigger
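
When the same trigger has to be created on many hosts, the command can be generated rather than typed. The sketch below assembles the command shown above from a dictionary of settings. It uses only the flags that appear in this document; build_create_trigger_command is a hypothetical helper, and it prints the command instead of running it.

```python
import shlex

# Plugin path as used in the examples in this document.
PLUGIN = "gh:influxdata/system_metrics/system_metrics.py"


def build_create_trigger_command(database, trigger_name, interval, arguments=None):
    """Return the argv list for an `influxdb3 create trigger` call.

    Hypothetical helper: only flags shown in this document are used.
    """
    argv = [
        "influxdb3", "create", "trigger",
        "--database", database,
        "--plugin-filename", PLUGIN,
        "--trigger-spec", f"every:{interval}",
    ]
    if arguments:
        pairs = []
        for key, value in arguments.items():
            # Render Python booleans in the lowercase form used by the examples.
            if isinstance(value, bool):
                value = "true" if value else "false"
            pairs.append(f"{key}={value}")
        argv += ["--trigger-arguments", ",".join(pairs)]
    argv.append(trigger_name)
    return argv


if __name__ == "__main__":
    argv = build_create_trigger_command(
        "system_monitoring",
        "system_metrics_custom_trigger",
        "30s",
        {"hostname": "web-server-01", "include_disk": False, "max_retries": 5},
    )
    # shlex.join quotes arguments where needed so the line is safe to paste into a shell.
    print(shlex.join(argv))
```

Returning a list rather than a single string keeps the arguments unambiguous if the command is later passed to subprocess.run.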

Example usage

Example 1: Web server monitoring

Monitor web server performance every 15 seconds with network statistics:

# Create trigger for web server monitoring
influxdb3 create trigger \
  --database web_monitoring \
  --plugin-filename gh:influxdata/system_metrics/system_metrics.py \
  --trigger-spec "every:15s" \
  --trigger-arguments hostname=web-server-01,include_network=true \
  web_server_metrics

# Query recent CPU metrics
influxdb3 query \
  --database web_monitoring \
  "SELECT * FROM system_cpu WHERE time >= now() - interval '5 minutes' LIMIT 5"

Expected output

+---------------+-------+------+--------+------+--------+-------+-------+-----------+------------------+
| host          | cpu   | user | system | idle | iowait | nice  | load1 | load5     | time             |
+---------------+-------+------+--------+------+--------+-------+-------+-----------+------------------+
| web-server-01 | total | 12.5 | 5.3    | 81.2 | 0.8    | 0.0   | 0.85  | 0.92      | 2024-01-15 10:00 |
| web-server-01 | total | 13.1 | 5.5    | 80.4 | 0.7    | 0.0   | 0.87  | 0.93      | 2024-01-15 10:01 |
| web-server-01 | total | 11.8 | 5.1    | 82.0 | 0.9    | 0.0   | 0.83  | 0.91      | 2024-01-15 10:02 |
+---------------+-------+------+--------+------+--------+-------+-------+-----------+------------------+

Example 2: Database server monitoring

Collect CPU and disk metrics for a database server with network metrics disabled (memory metrics stay enabled by default):

# Create trigger for database server
influxdb3 create trigger \
  --database db_monitoring \
  --plugin-filename gh:influxdata/system_metrics/system_metrics.py \
  --trigger-spec "every:30s" \
  --trigger-arguments hostname=db-primary,include_disk=true,include_cpu=true,include_network=false \
  database_metrics

# Query disk usage
influxdb3 query \
  --database db_monitoring \
  "SELECT * FROM system_disk_usage WHERE host = 'db-primary'"

Example 3: High-frequency monitoring

Collect all metrics every 10 seconds with higher retry tolerance:

# Create high-frequency monitoring trigger
influxdb3 create trigger \
  --database system_monitoring \
  --plugin-filename gh:influxdata/system_metrics/system_metrics.py \
  --trigger-spec "every:10s" \
  --trigger-arguments hostname=critical-server,max_retries=10 \
  high_freq_metrics

Code overview

Files

  • system_metrics.py: The main plugin code containing system metrics collection logic
  • system_metrics_config_scheduler.toml: Example TOML configuration file for scheduled triggers

Logging

Logs are stored in the _internal database (or the database where the trigger is created) in the system.processing_engine_logs table. To view logs:

influxdb3 query --database _internal "SELECT * FROM system.processing_engine_logs WHERE trigger_name = 'your_trigger_name'"

Log columns:

  • event_time: Timestamp of the log event
  • trigger_name: Name of the trigger that generated the log
  • log_level: Severity level (INFO, WARN, ERROR)
  • log_text: Message describing the action or error

Main functions

process_scheduled_call(influxdb3_local, call_time, args)

The main entry point for scheduled triggers. Collects system metrics based on configuration and writes them to InfluxDB.

Key operations:

  1. Parses configuration from arguments
  2. Collects CPU, memory, disk, and network metrics based on configuration
  3. Writes metrics to InfluxDB with proper error handling and retry logic
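
The retry behavior described above could look roughly like the following. This is a sketch under assumptions, not the plugin's implementation: run_with_retries and collect_all are hypothetical helpers, max_retries is treated as the total number of attempts, and a collector that keeps failing is skipped so the remaining metric types are still gathered.

```python
import time


def run_with_retries(operation, max_retries=3, delay_seconds=0.0, log=print):
    """Call operation() until it succeeds or max_retries attempts are used.

    Returns (True, result) on success and (False, last_exception) on failure.
    Assumption: max_retries counts total attempts, which may differ from the plugin.
    """
    last_error = None
    for attempt in range(1, max_retries + 1):
        try:
            return True, operation()
        except Exception as exc:  # broad on purpose: any collector error is retried
            last_error = exc
            log(f"attempt {attempt}/{max_retries} failed: {exc}")
            if attempt < max_retries and delay_seconds > 0:
                time.sleep(delay_seconds)
    return False, last_error


def collect_all(collectors, max_retries=3, log=print):
    """Run each named collector independently; a failed one yields None."""
    results = {}
    for name, collector in collectors.items():
        ok, value = run_with_retries(collector, max_retries, log=log)
        results[name] = value if ok else None
    return results


if __name__ == "__main__":
    def failing_disk():
        raise PermissionError("disk I/O counters not readable")

    print(collect_all({"cpu": lambda: {"user": 12.5}, "disk": failing_disk}, max_retries=2))
```

Isolating each collector this way matches the behavior described under Troubleshooting, where the plugin keeps collecting other metrics when some require elevated permissions.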

collect_cpu_metrics(influxdb3_local, hostname)

Collects CPU utilization and performance metrics including per-core statistics and system load averages.

collect_memory_metrics(influxdb3_local, hostname)

Collects memory usage statistics including RAM, swap, and page fault information.

collect_disk_metrics(influxdb3_local, hostname)

Collects disk usage and I/O statistics for all mounted partitions.

collect_network_metrics(influxdb3_local, hostname)

Collects network interface statistics including bytes transferred and error counts.

Measurements and fields

system_cpu

Overall CPU statistics and metrics:

  • Tags: host, cpu=total
  • Fields: user, system, idle, iowait, nice, irq, softirq, steal, guest, guest_nice, frequency_current, frequency_min, frequency_max, ctx_switches, interrupts, soft_interrupts, syscalls, load1, load5, load15

system_cpu_cores

Per-core CPU statistics:

  • Tags: host, core (core number)
  • Fields: usage, user, system, idle, iowait, nice, irq, softirq, steal, guest, guest_nice, frequency_current, frequency_min, frequency_max

system_memory

System memory statistics:

  • Tags: host
  • Fields: total, available, used, free, active, inactive, buffers, cached, shared, slab, percent

system_swap

Swap memory statistics:

  • Tags: host
  • Fields: total, used, free, percent, sin, sout

system_memory_faults

Memory page fault information (when available):

  • Tags: host
  • Fields: page_faults, major_faults, minor_faults, rss, vms, dirty, uss, pss

system_disk_usage

Disk partition usage:

  • Tags: host, device, mountpoint, fstype
  • Fields: total, used, free, percent

system_disk_io

Disk I/O statistics:

  • Tags: host, device
  • Fields: reads, writes, read_bytes, write_bytes, read_time, write_time, busy_time, read_merged_count, write_merged_count

system_disk_performance

Calculated disk performance metrics:

  • Tags: host, device
  • Fields: read_bytes_per_sec, write_bytes_per_sec, read_iops, write_iops, avg_read_latency_ms, avg_write_latency_ms, util_percent
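
Rates and latencies like these are typically derived from two successive samples of the cumulative counters in system_disk_io. The sketch below shows one plausible calculation; the formulas, the DiskCounters type, and the assumption that read_time, write_time, and busy_time are cumulative milliseconds are illustrative and not confirmed against system_metrics.py.

```python
from dataclasses import dataclass


@dataclass
class DiskCounters:
    """One sample of cumulative disk counters (hypothetical container type)."""
    reads: int
    writes: int
    read_bytes: int
    write_bytes: int
    read_time: int   # assumed cumulative milliseconds spent reading
    write_time: int  # assumed cumulative milliseconds spent writing
    busy_time: int   # assumed cumulative milliseconds the device was busy


def disk_performance(prev, curr, interval_seconds):
    """Derive per-second rates and average latencies from two samples."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")

    def delta(name):
        # Clamp to zero so a counter reset does not produce negative rates.
        return max(0, getattr(curr, name) - getattr(prev, name))

    d_reads, d_writes = delta("reads"), delta("writes")
    busy_fraction = delta("busy_time") / (interval_seconds * 1000.0)
    return {
        "read_bytes_per_sec": delta("read_bytes") / interval_seconds,
        "write_bytes_per_sec": delta("write_bytes") / interval_seconds,
        "read_iops": d_reads / interval_seconds,
        "write_iops": d_writes / interval_seconds,
        "avg_read_latency_ms": delta("read_time") / d_reads if d_reads else 0.0,
        "avg_write_latency_ms": delta("write_time") / d_writes if d_writes else 0.0,
        "util_percent": min(100.0, busy_fraction * 100.0),
    }


if __name__ == "__main__":
    before = DiskCounters(0, 0, 0, 0, 0, 0, 0)
    after = DiskCounters(300, 150, 3_000_000, 1_500_000, 600, 450, 15_000)
    print(disk_performance(before, after, interval_seconds=30))
```

Because the inputs are deltas between runs, the first run after a trigger starts has no previous sample and cannot produce these values.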

system_network

Network interface statistics:

  • Tags: host, interface
  • Fields: bytes_sent, bytes_recv, packets_sent, packets_recv, errin, errout, dropin, dropout
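
To make the tag and field layout concrete, the sketch below renders one point as InfluxDB line protocol. It is an illustration only: to_line_protocol is a hypothetical helper, the sample values are made up, and the plugin itself may build points through the Processing Engine API instead.

```python
def escape(text, special):
    """Backslash-escape each character in `special` within text."""
    out = str(text)
    for ch in special:
        out = out.replace(ch, "\\" + ch)
    return out


def format_field(value):
    """Format a field value: booleans, integers (i suffix), floats, strings."""
    if isinstance(value, bool):  # checked first because bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"
    if isinstance(value, float):
        return repr(value)
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def to_line_protocol(measurement, tags, fields, timestamp_ns=None):
    """Render one point; tags are sorted by key, fields keep insertion order."""
    if not fields:
        raise ValueError("line protocol requires at least one field")
    head = escape(measurement, ", ")
    for key in sorted(tags):
        head += f",{escape(key, ',= ')}={escape(tags[key], ',= ')}"
    body = ",".join(f"{escape(k, ',= ')}={format_field(v)}" for k, v in fields.items())
    line = f"{head} {body}"
    if timestamp_ns is not None:
        line += f" {timestamp_ns}"
    return line


if __name__ == "__main__":
    print(to_line_protocol(
        "system_cpu",
        {"host": "web-server-01", "cpu": "total"},
        {"user": 12.5, "idle": 81.2, "ctx_switches": 1024},
        timestamp_ns=1705312800000000000,
    ))
```

Tag values such as mount points can contain spaces, which is why tag keys and values are escaped separately from field values.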

Troubleshooting

Common issues

Issue: Permission errors on disk I/O metrics

Some disk I/O metrics may require elevated permissions.

Solution: The plugin will continue collecting other metrics even if some require elevated permissions. Consider running InfluxDB 3 with appropriate permissions if disk I/O metrics are critical.

Issue: Missing psutil library

ERROR: No module named 'psutil'

Solution: Install the psutil package:

influxdb3 install package psutil

Issue: High CPU usage from plugin

If the plugin causes high CPU usage, consider:

  • Increasing the trigger interval (for example, from every:10s to every:30s)
  • Disabling unnecessary metric types
  • Reducing the number of disk partitions monitored

Issue: No data being collected

Solution:

  1. Check that the trigger is active:
    influxdb3 query --database _internal "SELECT * FROM system.processing_engine_logs WHERE trigger_name = 'your_trigger_name'"
  2. Verify system permissions allow access to system metrics
  3. Check that the psutil package is properly installed

Debugging tips

  1. Check recent metrics collection:

    # List all system metric measurements
    influxdb3 query \
      --database system_monitoring \
      --language influxql \
      "SHOW MEASUREMENTS WITH MEASUREMENT =~ /^system_/"
    
    # Check recent CPU metrics
    influxdb3 query \
      --database system_monitoring \
      "SELECT COUNT(*) FROM system_cpu WHERE time >= now() - interval '1 hour'"
  2. Monitor plugin logs:

    influxdb3 query \
      --database _internal \
      "SELECT * FROM system.processing_engine_logs WHERE trigger_name = 'system_metrics_trigger' ORDER BY event_time DESC LIMIT 10"
  3. Test metric collection manually:

    influxdb3 test schedule_plugin \
      --database system_monitoring \
      --schedule "0 0 * * * ?" \
      system_metrics.py

Performance considerations

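  • The plugin collects system metrics using the psutil library.
  • Each run collects only the enabled metric types; set include_cpu, include_memory, include_disk, or include_network to false to skip types you don't need.
  • Failed operations are retried up to max_retries times (default 3).
  • Shorter trigger intervals increase overhead; lengthen the interval (for example, from every:10s to every:30s) if the plugin uses too much CPU.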

Report an issue

For plugin issues, see the Plugins repository issues page.

Find support for InfluxDB 3 Enterprise

The InfluxDB Discord server is the best place to find support for InfluxDB 3 Core and InfluxDB 3 Enterprise. For other InfluxDB versions, see the Support and feedback options.

