Heartbeat Output Plugin
This plugin sends a heartbeat signal via POST to an HTTP endpoint at a regular interval. This is useful for keeping track of existing Telegraf instances in a large deployment.
Introduced in: Telegraf v1.37.0
Tags: applications
OS support: all
Global configuration options
Plugins support additional global and plugin configuration settings for tasks such as modifying metrics, tags, and fields, creating aliases, and configuring plugin ordering. See CONFIGURATION.md for more details.
Secret-store support
This plugin supports secrets from secret-stores for the url, token and
headers options.
See the secret-store documentation for more details on how
to use them.
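As a minimal sketch, a secret reference can take the place of a literal token value. The snippet assumes that a secret-store with the hypothetical ID mystore is configured elsewhere and holds a secret named heartbeat_token; both names are illustrative only.

```toml
[[outputs.heartbeat]]
  url = "http://monitoring.example.com/heartbeat"
  instance_id = "agent-123"
  ## Resolved at runtime from the (hypothetical) secret-store "mystore"
  token = "@{mystore:heartbeat_token}"
```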
Configuration
# A plugin that can transmit heartbeats over HTTP
[[outputs.heartbeat]]
## URL of heartbeat endpoint
url = "http://monitoring.example.com/heartbeat"
## Unique identifier to submit for the Telegraf instance (required)
instance_id = "agent-123"
## Token for bearer authentication
# token = ""
## Interval for sending heartbeat messages
# interval = "1m"
## Information to include in the message, available options are
## hostname -- hostname of the instance running Telegraf
## statistics -- number of metrics, logged errors and warnings, etc
## configs -- redacted list of configs loaded by this instance
## logs -- detailed log-entries for this instance
## status -- result of the status condition evaluation
# include = ["hostname"]
## Logging information filtering, only applies if "logs" is added to "include"
# [outputs.heartbeat.logs]
# ## Number of log entries to send (unlimited by default)
# ## In case more log-entries are available entries with higher log levels
# ## and more recent entries are preferred.
# # limit = 0
#
# ## Minimum log-level for sending the entry
# # level = "error"
## Logical conditions to determine the agent status, only applies if "status"
## is included in the message
# [outputs.heartbeat.status]
# ## Conditions to signal the given status as CEL programs returning a
# ## boolean. Conditions are evaluated in the order below until a program
# ## evaluates to "true".
# # ok = "false"
# # warn = "false"
# # fail = "false"
#
# ## Evaluation order of the conditions above; available: "ok", "warn", "fail"
# # order = ["ok", "warn", "fail"]
#
# ## Default status used if none of the conditions above matches
# ## available: "ok", "warn", "fail", "undefined"
# # default = "ok"
#
# ## If set, send this initial status before the first write, otherwise
# ## compute the status from the conditions and default above.
# ## available: "ok", "warn", "fail", "undefined", ""
# # initial = ""
## Additional HTTP headers
# [outputs.heartbeat.headers]
# User-Agent = "telegraf"
Each heartbeat message, sent every interval, contains at least the specified
Telegraf instance_id, the Telegraf version and the version of the JSON-Schema
used for the message. The latest schema can be found in the
plugin directory.
Additional information can be included in the message via the include setting.
Some information, e.g. the number of metrics, is only updated after the first flush cycle. Keep this in mind when interpreting the messages.
Statistics included in heartbeat messages are accumulated since the last
successful heartbeat. If a heartbeat cannot be sent, accumulation of data
continues until the next successful send. Additionally, in a message sent after
a failed send, the last field contains the Unix timestamp of the last successful
heartbeat, allowing you to identify gaps in reporting and to calculate rates.
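A minimal sketch of a configuration that adds statistics to each heartbeat could look as follows. The URL, instance ID and the 30 second interval are placeholder values chosen for illustration.

```toml
[[outputs.heartbeat]]
  url = "http://monitoring.example.com/heartbeat"
  instance_id = "agent-123"
  ## Send a heartbeat every 30 seconds instead of the default
  interval = "30s"
  ## Add the hostname and the accumulated statistics to each message
  include = ["hostname", "statistics"]
```

With such a setup, the first message may still report stale values because some statistics are only updated after the first flush cycle.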
Configuration information
When including configs in the message, the heartbeat message will contain the
configuration sources used to set up the currently running Telegraf instance.
As the configuration sources contain the path or the URL, the resulting heartbeat messages may be large. Use this option with care if network traffic is a limiting factor!
The configuration information can change at runtime, e.g. when Telegraf watches the configuration directory and a configuration is added or removed.
Configuration URLs are redacted to remove the username and password information. However, sensitive information might still be contained in the URL or the path sent. Use with care!
Logging information
When including logs in the message, the actual log messages are included.
This comprises the log messages of all plugins and the agent itself that are
logged after the Connect function of this plugin was called, i.e. you will
not see any initialization or configuration errors in the heartbeat messages!
You can restrict the messages sent in the optional outputs.heartbeat.logs
section, either by log-level using the level setting or by number using the
limit setting.
As the amount of log messages can be high, especially when configuring a low
level such as info, the resulting heartbeat messages might be large. Restrict
the included messages by choosing a higher log-level and/or by using a limit!
When including statistics in the message, the number of errors and warnings
logged in this Telegraf instance is included in the heartbeat message. The
counts cover all log messages of all plugins and the agent itself logged after
the Connect function of this plugin was called, i.e. initialization or
configuration errors are not counted. To get the actual log messages, include
logs as described above and use the outputs.heartbeat.logs section to
restrict them.
When setting the level option only messages with this or more severe levels
are included.
The limit setting allows you to specify the maximum number of log messages
included in the heartbeat message. If the number of log messages exceeds the
given limit, the most severe and most recent messages are selected first.
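The log filtering described in this section might be configured as in the following sketch. The limit of 10 entries is an arbitrary value chosen for illustration; the level shown is the documented default.

```toml
[[outputs.heartbeat]]
  url = "http://monitoring.example.com/heartbeat"
  instance_id = "agent-123"
  ## "logs" must be included for the section below to take effect
  include = ["hostname", "logs"]

  [outputs.heartbeat.logs]
    ## Send at most 10 entries, preferring severe and recent ones
    limit = 10
    ## Only send entries with level "error" or more severe
    level = "error"
```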
Status information
By including status, the message will contain the status of the Telegraf
instance as configured via the outputs.heartbeat.status section.
This section allows you to set an initial state that is used as long as no
flush was performed by Telegraf. If initial is not configured or empty, the
status expressions are also evaluated before the first flush.
The ok, warn and fail settings allow you to specify CEL expressions
evaluating to a boolean value. The information available to the expressions is
listed below. The first expression evaluating to true defines the status.
The order parameter allows you to customize the evaluation order.
If an expression is omitted in the order setting, it will not be
evaluated!
The status defined via default is used in case none of the status expressions
evaluates to true.
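The interplay of initial, order and default can be sketched as follows. The expressions and thresholds are illustrative assumptions, not recommended values; log_errors and log_warnings are variables documented for this plugin.

```toml
[[outputs.heartbeat]]
  url = "http://monitoring.example.com/heartbeat"
  instance_id = "agent-123"
  include = ["hostname", "status"]

  [outputs.heartbeat.status]
    ## Report "undefined" as long as no flush was performed
    initial = "undefined"
    ## Checked first: any logged error marks the instance as failed
    fail = "log_errors > 0"
    ## Checked second: any logged warning yields a warning status
    warn = "log_warnings > 0"
    ## "ok" is not listed here, so an "ok" expression would never be evaluated
    order = ["fail", "warn"]
    ## Used if neither "fail" nor "warn" evaluates to true
    default = "ok"
```

Putting fail before warn in order makes the more severe condition win if both expressions would be true.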
For defining expressions you can use the following variables:
metrics (int) – number of metrics arriving at this plugin
log_errors (int) – number of errors logged
log_warnings (int) – number of warnings logged
last_update (time) – time of last successful heartbeat message, can be used to e.g. calculate rates
agent (map) – agent statistics, see below
inputs (map) – input plugin statistics, see below
outputs (map) – output plugin statistics, see below
The agent statistics variable is a map with information matching the
internal_agent metric of the internal input plugin:
metrics_written (int) – number of metrics written in total by all outputs
metrics_rejected (int) – number of metrics rejected in total by all outputs
metrics_dropped (int) – number of metrics dropped in total by all outputs
metrics_gathered (int) – number of metrics collected in total by all inputs
gather_errors (int) – number of errors during collection by all inputs
gather_timeouts (int) – number of collection timeouts by all inputs
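As a hedged sketch, the agent map could drive the status as shown below. It assumes that map entries can be read with CEL field selection (agent.metrics_dropped); the chosen conditions are examples only.

```toml
## Requires "status" in the "include" setting of the plugin
[outputs.heartbeat.status]
  ## Fail if any output dropped metrics
  fail = "agent.metrics_dropped > 0"
  ## Warn on collection errors or timeouts in any input
  warn = "agent.gather_errors > 0 || agent.gather_timeouts > 0"
  order = ["fail", "warn"]
  default = "ok"
```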
The inputs statistics variable is a map with the key denoting the plugin
type (e.g. cpu for inputs.cpu) and the value being a list of plugin
statistics. Each entry in the list corresponds to an input plugin instance with
information matching the internal_gather metric of the
internal input plugin:
id (string) – unique plugin identifier
alias (string) – alias set for the plugin; only exists if alias is defined
errors (int) – collection errors for this plugin instance
metrics_gathered (int) – number of metrics collected
gather_time_ns (int) – time used to gather the metrics in nanoseconds
gather_timeouts (int) – number of timeouts during metric collection
startup_errors (int) – number of times the plugin failed to start
The outputs statistics variable is a map with the key denoting the plugin
type (e.g. influxdb for outputs.influxdb) and the value being a list of plugin
statistics. Each entry in the list corresponds to an output plugin instance with
information matching the internal_write metric of the
internal input plugin:
id (string) – unique plugin identifier
alias (string) – alias set for the plugin; only exists if alias is defined
errors (int) – write errors for this plugin instance
metrics_filtered (int) – number of metrics filtered by the output
write_time_ns (int) – time used to write the metrics in nanoseconds
startup_errors (int) – number of times the plugin failed to start
metrics_added (int) – number of metrics added to the output buffer
metrics_written (int) – number of metrics written to the output
metrics_rejected (int) – number of metrics rejected by the service or serialization
metrics_dropped (int) – number of metrics dropped e.g. due to buffer fullness
buffer_size (int) – number of metrics currently in the output buffer for the plugin instance
buffer_limit (int) – capacity of the output buffer; irrelevant for disk-based buffers
buffer_fullness (float) – current ratio of metrics in the buffer to capacity; can be greater than one (i.e. > 100%) for disk-based buffers
If not stated otherwise, all variables are accumulated since the last successful heartbeat message.
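Because inputs and outputs map a plugin type to a list of instances, per-plugin conditions need to iterate over that list. The following sketch assumes the standard CEL macros has and exists are usable in these expressions and that an inputs.cpu and an outputs.influxdb plugin are configured; the 0.8 threshold is an arbitrary example.

```toml
## Requires "status" in the "include" setting of the plugin
[outputs.heartbeat.status]
  ## Fail if any influxdb output instance has a buffer filled beyond 80%;
  ## has() guards against the plugin type being absent from the map
  fail = "has(outputs.influxdb) && outputs.influxdb.exists(o, o.buffer_fullness > 0.8)"
  ## Warn if any cpu input instance reported collection errors
  warn = "has(inputs.cpu) && inputs.cpu.exists(i, i.errors > 0)"
  order = ["fail", "warn"]
  default = "ok"
```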
The following functions are available:
encoding functions of the CEL encoder library
math functions of the CEL math library
string functions of the CEL strings library
now function for getting the current time
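As an illustration of the now function, the sketch below warns when the last successful heartbeat is older than five minutes. It assumes that now is called as now(), that last_update behaves as a CEL timestamp, and that the standard CEL duration conversion is available; how last_update is set before the first successful heartbeat is not covered here.

```toml
## Requires "status" in the "include" setting of the plugin
[outputs.heartbeat.status]
  ## Subtracting two timestamps yields a duration in CEL;
  ## a TOML literal string (single quotes) avoids escaping the inner quotes
  warn = 'now() - last_update > duration("5m")'
  order = ["warn"]
  default = "ok"
```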