Threshold deadman checks plugin

The Threshold Deadman Checks Plugin monitors time series data in InfluxDB 3, combining real-time threshold detection with deadman monitoring. Use it to check field values against configurable thresholds, detect data absence, and trigger multi-level alerts based on aggregated metrics. It supports both scheduled batch monitoring and real-time data write monitoring, with configurable trigger counts and severity levels.

Configuration

Scheduled trigger parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| measurement | string | required | Measurement to monitor |
| senders | string | required | Dot-separated notification channels |
| window | string | required | Time window for data checking |

Data write trigger parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| measurement | string | required | Measurement to monitor for threshold conditions |
| field_conditions | string | required | Threshold conditions (for example, "temp>30-WARN:status==ok-INFO") |
| senders | string | required | Dot-separated notification channels |

Threshold check parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| field_aggregation_values | string | none | Aggregation conditions for scheduled checks |
| deadman_check | boolean | false | Enable deadman data presence checking |
| interval | string | "5min" | Aggregation time interval |
| trigger_count | number | 1 | Consecutive failures before alerting |

Notification parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| influxdb3_auth_token | string | env var | InfluxDB 3 API token |
| notification_deadman_text | string | template | Deadman alert message template |
| notification_threshold_text | string | template | Threshold alert message template |
| notification_text | string | template | General notification template (data write) |
| notification_path | string | "notify" | Notification endpoint path |
| port_override | number | 8181 | InfluxDB port override |
| config_file_path | string | none | TOML config file path relative to PLUGIN_DIR |

Channel-specific configuration

Notification channels require additional parameters based on the sender type (same as the Notifier Plugin).

Schema requirements

The plugin assumes that the table schema is already defined in the database, as it relies on this schema to retrieve field and tag names required for processing.

Requires existing schema

By design, the plugin returns an error if the schema doesn’t exist or doesn’t contain the expected columns.

TOML configuration

| Parameter | Type | Default | Description |
|---|---|---|---|
| config_file_path | string | none | TOML config file path relative to PLUGIN_DIR (required for TOML configuration) |

To use a TOML configuration file, set the PLUGIN_DIR environment variable and specify the config_file_path in the trigger arguments. This is in addition to the --plugin-dir flag when starting InfluxDB 3.

Example TOML configuration files are provided with the plugin.

For more information on using TOML configuration files, see the Using TOML Configuration Files section in influxdb3_plugins/README.md.

Installation

  1. Start InfluxDB 3 Enterprise with the Processing Engine enabled (--plugin-dir /path/to/plugins)

  2. Install required Python packages:

    • requests (for HTTP requests)
    influxdb3 install package requests

Create scheduled trigger

Create a trigger for periodic threshold and deadman checks:

influxdb3 create trigger \
  --database mydb \
  --plugin-filename threshold_deadman_checks_plugin.py \
  --trigger-spec "every:10m" \
  --trigger-arguments "measurement=cpu,senders=slack,field_aggregation_values=temp:avg@>=30-ERROR,window=10m,trigger_count=3,deadman_check=true,slack_webhook_url=https://hooks.slack.com/services/..." \
  threshold_scheduler
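
The --trigger-arguments value in the command above is a comma-separated list of key=value pairs. When triggers are created from scripts, assembling that string from a dictionary can reduce quoting mistakes. The following is a minimal sketch: build_trigger_arguments is a hypothetical helper, not part of the plugin or the influxdb3 CLI, and it rejects values containing commas because this page does not describe any escaping rules for them.

```python
def build_trigger_arguments(args: dict) -> str:
    """Join key/value pairs into the comma-separated key=value form.

    Hypothetical helper for illustration only.
    """
    parts = []
    for key, value in args.items():
        # Booleans are written in lowercase, as in deadman_check=true.
        if isinstance(value, bool):
            text = "true" if value else "false"
        else:
            text = str(value)
        # Assumption: commas cannot be escaped, so refuse them outright.
        if "," in text:
            raise ValueError(f"value for {key!r} contains a comma: {text!r}")
        parts.append(f"{key}={text}")
    return ",".join(parts)


scheduled_args = build_trigger_arguments({
    "measurement": "cpu",
    "senders": "slack",
    "field_aggregation_values": "temp:avg@>=30-ERROR",
    "window": "10m",
    "trigger_count": 3,
    "deadman_check": True,
})
print(scheduled_args)
```

The resulting string can be passed as the value of --trigger-arguments; channel-specific keys such as the webhook URL would be added to the dictionary in the same way.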

Create data write trigger

Create a trigger for real-time threshold monitoring:

influxdb3 create trigger \
  --database mydb \
  --plugin-filename threshold_deadman_checks_plugin.py \
  --trigger-spec "all_tables" \
  --trigger-arguments "measurement=cpu,field_conditions=temp>30-WARN:status==ok-INFO,senders=slack,trigger_count=2,slack_webhook_url=https://hooks.slack.com/services/..." \
  threshold_datawrite

Enable triggers

influxdb3 enable trigger --database mydb threshold_scheduler
influxdb3 enable trigger --database mydb threshold_datawrite

Examples

Deadman monitoring

Monitor for data absence and alert when no data is received:

influxdb3 create trigger \
  --database sensors \
  --plugin-filename threshold_deadman_checks_plugin.py \
  --trigger-spec "every:15m" \
  --trigger-arguments "measurement=heartbeat,senders=sms,window=10m,deadman_check=true,trigger_count=2,twilio_from_number=+1234567890,twilio_to_number=+0987654321,notification_deadman_text=CRITICAL: No heartbeat data from \$table between \$time_from and \$time_to" \
  heartbeat_monitor
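
Conceptually, a deadman check asks whether any data arrived inside the configured window. The sketch below illustrates that idea in Python and is not the plugin's implementation. The helper names are hypothetical, and the set of unit suffixes (s, m, h, d) is an assumption, since this page only shows window values such as "10m".

```python
from datetime import datetime, timedelta, timezone

# Assumed unit suffixes; the source only shows values such as "10m".
_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}


def parse_window(window: str) -> timedelta:
    """Convert a string such as "10m" into a timedelta (hypothetical helper)."""
    number, unit = window[:-1], window[-1]
    if unit not in _UNITS or not number.isdigit():
        raise ValueError(f"unsupported window: {window!r}")
    return timedelta(**{_UNITS[unit]: int(number)})


def is_data_absent(timestamps, now: datetime, window: str) -> bool:
    """Return True when no timestamp falls inside [now - window, now]."""
    start = now - parse_window(window)
    return not any(start <= ts <= now for ts in timestamps)


now = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
recent = [now - timedelta(minutes=3)]   # a point written 3 minutes ago
stale = [now - timedelta(minutes=25)]   # last point is older than the window

print(is_data_absent(recent, now, "10m"))
print(is_data_absent(stale, now, "10m"))
```

With trigger_count=2, as in the example above, an alert would only be expected after the data is found absent on two consecutive runs.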

Multi-level threshold monitoring

Monitor aggregated values with different severity levels:

influxdb3 create trigger \
  --database monitoring \
  --plugin-filename threshold_deadman_checks_plugin.py \
  --trigger-spec "every:5m" \
  --trigger-arguments "measurement=system_metrics,senders=slack.discord,field_aggregation_values=cpu_usage:avg@>=80-WARN\$cpu_usage:avg@>=95-ERROR\$memory_usage:max@>=90-WARN,window=5m,interval=1min,trigger_count=3,slack_webhook_url=https://hooks.slack.com/services/...,discord_webhook_url=https://discord.com/api/webhooks/..." \
  system_threshold_monitor

Real-time field condition monitoring

Monitor data writes for immediate threshold violations:

influxdb3 create trigger \
  --database applications \
  --plugin-filename threshold_deadman_checks_plugin.py \
  --trigger-spec "all_tables" \
  --trigger-arguments "measurement=response_times,field_conditions=latency>500-WARN:latency>1000-ERROR:error_rate>0.05-CRITICAL,senders=http,trigger_count=1,http_webhook_url=https://alertmanager.example.com/webhook,notification_text=[\$level] Application alert: \$field \$op_sym \$compare_val (actual: \$actual)" \
  app_performance_monitor

Combined monitoring

Monitor both aggregation thresholds and deadman conditions:

influxdb3 create trigger \
  --database comprehensive \
  --plugin-filename threshold_deadman_checks_plugin.py \
  --trigger-spec "every:10m" \
  --trigger-arguments "measurement=temperature_sensors,senders=whatsapp,field_aggregation_values=temperature:avg@>=35-WARN\$temperature:max@>=40-ERROR,window=15m,deadman_check=true,trigger_count=2,twilio_from_number=+1234567890,twilio_to_number=+0987654321" \
  comprehensive_sensor_monitor

Features

  • Dual monitoring modes: Scheduled aggregation checks and real-time data write monitoring
  • Deadman detection: Monitor for data absence and missing data streams
  • Multi-level alerting: Support for INFO, WARN, ERROR, and CRITICAL severity levels
  • Aggregation support: Monitor avg, min, max, count, sum, derivative, and median values
  • Configurable triggers: Require multiple consecutive failures before alerting
  • Multi-channel notifications: Integration with various notification systems
  • Template messages: Customizable alert templates with dynamic variables
  • Performance optimization: Measurement and tag caching for improved efficiency

Troubleshooting

Common issues

No alerts triggered

  • Verify threshold values are appropriate for your data ranges
  • Check that notification channels are properly configured
  • Ensure the Notifier Plugin is installed and accessible
  • Review plugin logs for configuration errors

False positive alerts

  • Increase trigger_count to require more consecutive failures
  • Adjust threshold values to be less sensitive
  • Consider longer aggregation intervals for noisy data

Missing deadman alerts

  • Verify deadman_check=true is set in configuration
  • Check that the measurement name matches existing data
  • Ensure the time window is appropriate for your data frequency

Authentication issues

  • Set the INFLUXDB3_AUTH_TOKEN environment variable
  • Verify API token has required database permissions
  • Check Twilio credentials for SMS/WhatsApp notifications

Configuration formats

Aggregation conditions (scheduled)

  • Format: field:aggregation@"operator value-level"
  • Example: temp:avg@">=30-ERROR"
  • Multiple conditions: temp:avg@">=30-WARN"$humidity:min@"<40-INFO"
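
To make the aggregation condition format concrete, the sketch below parses a condition string into its parts. It is an illustration only, not the plugin's parser. It assumes that conditions are separated by $, that the double quotes around the operator and value are optional (the trigger examples on this page omit them), and that the operators match those listed for field conditions. The names AggregationCondition and parse_aggregation_conditions are hypothetical.

```python
import re
from typing import List, NamedTuple


class AggregationCondition(NamedTuple):
    field: str
    aggregation: str
    operator: str
    value: float
    level: str


# field:aggregation@"operator value-level", with the quotes treated as optional.
_AGG_RE = re.compile(
    r'^(?P<field>[^:]+):(?P<agg>[a-z]+)@"?'
    r'(?P<op>>=|<=|==|!=|>|<)\s*(?P<value>-?[^-"\s]+)-(?P<level>[A-Z]+)"?$'
)


def parse_aggregation_conditions(spec: str) -> List[AggregationCondition]:
    """Split on "$" and parse each condition (hypothetical helper)."""
    conditions = []
    for part in spec.split("$"):
        match = _AGG_RE.match(part.strip())
        if match is None:
            raise ValueError(f"cannot parse aggregation condition: {part!r}")
        conditions.append(
            AggregationCondition(
                match["field"],
                match["agg"],
                match["op"],
                float(match["value"]),
                match["level"],
            )
        )
    return conditions


quoted = parse_aggregation_conditions('temp:avg@">=30-WARN"$humidity:min@"<40-INFO"')
unquoted = parse_aggregation_conditions(
    "cpu_usage:avg@>=80-WARN$cpu_usage:avg@>=95-ERROR$memory_usage:max@>=90-WARN"
)
for condition in quoted + unquoted:
    print(condition)
```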

Field conditions (data write)

  • Format: field operator value-level
  • Example: temp>30-WARN:status==ok-INFO (two conditions separated by a colon)
  • Supported operators: >, <, >=, <=, ==, !=
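
The sketch below shows one way the field condition format could be parsed and evaluated against a single written row. It is illustrative rather than the plugin's own code. The function names are hypothetical, and it assumes that values which parse as numbers are compared numerically while everything else is compared as a string.

```python
import operator
import re

_OPS = {
    ">": operator.gt,
    "<": operator.lt,
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
    "!=": operator.ne,
}

# field operator value-level, for example temp>30-WARN
_COND_RE = re.compile(
    r"^(?P<field>\w+)\s*(?P<op>>=|<=|==|!=|>|<)\s*(?P<value>.+)-(?P<level>[A-Z]+)$"
)


def parse_field_conditions(spec: str) -> list:
    """Parse colon-separated conditions into (field, op, value, level) tuples."""
    parsed = []
    for part in spec.split(":"):
        match = _COND_RE.match(part.strip())
        if match is None:
            raise ValueError(f"cannot parse field condition: {part!r}")
        raw = match["value"].strip()
        try:
            value = float(raw)  # assumption: numeric-looking values are numbers
        except ValueError:
            value = raw
        parsed.append((match["field"], match["op"], value, match["level"]))
    return parsed


def violated_levels(row: dict, conditions: list) -> list:
    """Return the severity levels of all conditions that the row satisfies."""
    levels = []
    for field, op, value, level in conditions:
        if field not in row:
            continue
        try:
            if _OPS[op](row[field], value):
                levels.append(level)
        except TypeError:
            # Mismatched types (for example, string vs. number) are skipped.
            continue
    return levels


conditions = parse_field_conditions("temp>30-WARN:status==ok-INFO")
print(violated_levels({"temp": 31.5, "status": "ok"}, conditions))
print(violated_levels({"temp": 20, "status": "down"}, conditions))
```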

Supported aggregations

  • avg: Average value
  • min: Minimum value
  • max: Maximum value
  • count: Count of records
  • sum: Sum of values
  • derivative: Rate of change
  • median: Median value
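
As a reference for what most of these aggregation names mean, the sketch below computes them over a small list of values with the Python standard library. This page does not describe how the plugin computes aggregations, so the code only illustrates the semantics. The derivative aggregation is left out because its exact definition is not given here, and the aggregate helper is hypothetical.

```python
import statistics

# Mapping from aggregation name to a standard-library equivalent.
# "derivative" is intentionally omitted (its definition is not specified here).
AGGREGATIONS = {
    "avg": statistics.fmean,
    "min": min,
    "max": max,
    "count": len,
    "sum": sum,
    "median": statistics.median,
}


def aggregate(values, name: str):
    """Apply the named aggregation to a non-empty list of numbers."""
    if name not in AGGREGATIONS:
        raise ValueError(f"unsupported aggregation: {name!r}")
    return AGGREGATIONS[name](values)


samples = [28.0, 31.0, 34.0, 35.0]
for name in AGGREGATIONS:
    print(name, aggregate(samples, name))
```

For these samples, a condition such as temp:avg@>=30-ERROR would be satisfied, because the average is 32.0.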

Message template variables

Deadman notifications

  • $table: Measurement name
  • $time_from: Start of checked period
  • $time_to: End of checked period

Threshold notifications (scheduled)

  • $level: Alert severity level
  • $table: Measurement name
  • $field: Field name
  • $aggregation: Aggregation type
  • $op_sym: Operator symbol
  • $compare_val: Threshold value
  • $actual: Actual measured value
  • $row: Unique identifier

Threshold notifications (data write)

  • $level: Alert severity level
  • $field: Field name
  • $op_sym: Operator symbol
  • $compare_val: Threshold value
  • $actual: Actual field value
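
The $name placeholders follow the same syntax as Python's string.Template, so a template can be previewed locally before it is placed in a trigger. The sketch below does that with sample values. Whether the plugin itself uses string.Template is an assumption, and the sample values are made up. In the CLI examples on this page, each $ is written as \$ only so that the shell does not expand it inside double quotes.

```python
from string import Template

# Data write threshold template, as used in the real-time example on this page.
threshold_template = Template(
    "[$level] Application alert: $field $op_sym $compare_val (actual: $actual)"
)
threshold_message = threshold_template.safe_substitute(
    level="WARN", field="latency", op_sym=">", compare_val=500, actual=742
)
print(threshold_message)

# Deadman template; safe_substitute leaves unknown placeholders untouched.
deadman_template = Template(
    "CRITICAL: No heartbeat data from $table between $time_from and $time_to"
)
deadman_message = deadman_template.safe_substitute(
    table="heartbeat",
    time_from="2025-01-01T11:50:00Z",
    time_to="2025-01-01T12:00:00Z",
)
print(deadman_message)
```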

Row identification

The $row variable uniquely identifies an alert context using the format measurement:level:tag1=value1:tag2=value2

This ensures trigger counts are maintained independently for each unique combination of measurement, severity level, and tag values.
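
The sketch below illustrates how a row identifier in this format can keep consecutive-failure counts separate for each context. It is not the plugin's code. The names are hypothetical, and two behaviors are assumptions: tags are sorted by name when the key is built, and the counter resets after an alert fires.

```python
from collections import defaultdict


def build_row_key(measurement: str, level: str, tags: dict) -> str:
    """Build measurement:level:tag1=value1:tag2=value2 (tags sorted by name)."""
    tag_part = ":".join(f"{key}={value}" for key, value in sorted(tags.items()))
    return ":".join(part for part in (measurement, level, tag_part) if part)


class ConsecutiveFailureTracker:
    """Count consecutive failures per row key (hypothetical helper)."""

    def __init__(self, trigger_count: int):
        self.trigger_count = trigger_count
        self._failures = defaultdict(int)

    def record(self, row_key: str, failed: bool) -> bool:
        """Return True when an alert should fire for this row key."""
        if not failed:
            self._failures[row_key] = 0  # a passing check breaks the streak
            return False
        self._failures[row_key] += 1
        if self._failures[row_key] >= self.trigger_count:
            self._failures[row_key] = 0  # assumption: reset after alerting
            return True
        return False


tracker = ConsecutiveFailureTracker(trigger_count=2)
key_a = build_row_key("cpu", "WARN", {"host": "a"})
key_b = build_row_key("cpu", "WARN", {"host": "b"})

print(key_a)
print(tracker.record(key_a, True))  # first failure for host a
print(tracker.record(key_b, True))  # first failure for host b, counted separately
print(tracker.record(key_a, True))  # second consecutive failure for host a
```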

Report an issue

For plugin issues, see the Plugins repository issues page.

Find support for InfluxDB 3 Enterprise

The InfluxDB Discord server is the best place to find support for InfluxDB 3 Enterprise. For other InfluxDB versions, see the Support and feedback options.

