AlertNode

Warning! This page documents an old version of Kapacitor, which is no longer actively developed. Kapacitor v1.3 is the most recent stable version of Kapacitor.

An AlertNode can trigger an event of varying severity levels and pass the event to alert handlers. The criteria for triggering an alert are specified via a lambda expression. See AlertNode.Info, AlertNode.Warn, and AlertNode.Crit below.

Different event handlers can be configured for each AlertNode. Some handlers, such as Email, HipChat, Slack, OpsGenie, VictorOps and PagerDuty, have a configuration option 'global' that makes all alerts use the handler implicitly.

Available event handlers:

  • log – log alert data to file.
  • post – HTTP POST data to a specified URL.
  • email – Send an email with alert data.
  • exec – Execute a command passing alert data over STDIN.
  • HipChat – Post alert message to HipChat room.
  • Alerta – Post alert message to Alerta.
  • Slack – Post alert message to Slack channel.
  • OpsGenie – Send alert to OpsGenie.
  • VictorOps – Send alert to VictorOps.
  • PagerDuty – Send alert to PagerDuty.

See below for more details on configuring each handler.

Each event that gets sent to a handler contains the following alert data:

  • ID – the ID of the alert, user defined.
  • Message – the alert message, user defined.
  • Time – the time the alert occurred.
  • Level – one of OK, INFO, WARNING or CRITICAL.
  • Data – influxql.Result containing the data that triggered the alert.

Events are sent to handlers if the alert is in a state other than 'OK', or if the alert just changed to the 'OK' state from a non-'OK' state (that is, the alert recovered). When the AlertNode.StateChangesOnly property is set, events are sent to handlers only if the alert changed state.

It is valid to configure multiple alert handlers, even of the same type.

Example:

   stream
        .groupBy('service')
        .alert()
            .id('kapacitor/{{ index .Tags "service" }}')
            .message('{{ .ID }} is {{ .Level }} value:{{ index .Fields "value" }}')
            .info(lambda: "value" > 10)
            .warn(lambda: "value" > 20)
            .crit(lambda: "value" > 30)
            .post("http://example.com/api/alert")
            .post("http://another.example.com/api/alert")
            .email('oncall@example.com')

It is assumed that each successive level filters a subset of the previous level. As a result, the filter will only be applied if a data point passed the previous level. In the above example, if value = 15 then the INFO and WARNING expressions would be evaluated, but not the CRITICAL expression. Each expression maintains its own state.
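The cascade described above can be sketched in a few lines of Python. This is only an illustration of the documented behaviour, not Kapacitor's implementation; the function name `evaluate_level` and the `LEVELS` table are hypothetical, and the thresholds are copied from the example.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Predicate = Optional[Callable[[float], bool]]

# Thresholds copied from the TICKscript example: info > 10, warn > 20, crit > 30.
LEVELS: Sequence[Tuple[str, Predicate]] = [
    ("INFO", lambda v: v > 10),
    ("WARNING", lambda v: v > 20),
    ("CRITICAL", lambda v: v > 30),
]


def evaluate_level(value: float, levels: Sequence[Tuple[str, Predicate]]) -> Tuple[str, List[str]]:
    """Return the resulting level and the names of the expressions that were evaluated."""
    current = "OK"
    evaluated: List[str] = []
    for name, predicate in levels:
        if predicate is None:
            continue  # an unset level is skipped
        evaluated.append(name)
        if not predicate(value):
            break  # higher levels are not evaluated once one fails
        current = name
    return current, evaluated


print(evaluate_level(15, LEVELS))
```

With value = 15 the sketch evaluates the INFO and WARNING expressions, stops there, and reports INFO.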

Properties

Property methods modify state on the calling node. They do not add another node to the pipeline, and always return a reference to the calling node.

Alerta

Send the alert to Alerta.

Example:

    [alerta]
      enabled = true
      url = "https://alerta.yourdomain"
      token = "9hiWoDOZ9IbmHsOTeST123ABciWTIqXQVFDo63h9"
      environment = "Production"
      origin = "Kapacitor"

To avoid posting a message at every alert interval, use AlertNode.StateChangesOnly so that only events where the alert changed state are sent to Alerta.

Send alerts to Alerta. The resource and event properties are required.

Example:

    stream...
         .alert()
             .alerta()
                 .resource('Hostname or service')
                 .event('Something went wrong')

Alerta also accepts optional alert information.

Example:

    stream...
         .alert()
             .alerta()
                 .resource('Hostname or service')
                 .event('Something went wrong')
                 .environment('Development')
                 .group('Dev. Servers')

NOTE: Alerta cannot be configured globally because of its required properties.

node.alerta()

Alerta Environment

Alerta environment. If empty uses the environment from the configuration.

node.alerta()
      .environment(value string)

Alerta Event

Alerta event. This is a required field.

node.alerta()
      .event(value string)

Alerta Group

Alerta group.

node.alerta()
      .group(value string)

Alerta Origin

Alerta origin. If empty uses the origin from the configuration.

node.alerta()
      .origin(value string)

Alerta Resource

Alerta resource. This is a required field.

node.alerta()
      .resource(value string)

Alerta Token

Alerta authentication token. If empty uses the token from the configuration.

node.alerta()
      .token(value string)

Alerta Value

Alerta value.

node.alerta()
      .value(value string)

Crit

Filter expression for the CRITICAL alert level. An empty value indicates the level is invalid and is skipped.

node.crit(value tick.Node)

Email

Email the alert data.

If the To list is empty, the To addresses from the configuration are used. The email subject is the AlertNode.Message property. The email body is the JSON alert data.

If the 'smtp' section in the configuration has the option global = true, then all alerts are sent via email without the need to explicitly state it in the TICKscript.

Example:

     [smtp]
       enabled = true
       host = "localhost"
       port = 25
       username = ""
       password = ""
       from = "kapacitor@example.com"
       to = ["oncall@example.com"]
       # Set global to true so all alerts trigger emails.
       global = true

Example:

    stream...
         .alert()

Send email to 'oncall@example.com' from 'kapacitor@example.com'

NOTE: The global option for email also implies stateChangesOnly is set on all alerts.

node.email(to ...string)

Exec

Execute a command whenever an alert is triggered and pass the alert data over STDIN in JSON format.

node.exec(executable string, args ...string)
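A command run by this handler might look like the following Python sketch. It assumes the JSON keys are the lowercase forms of the alert data fields listed earlier (`id`, `message`, `level`); that naming is an assumption and should be checked against real output. The function names are hypothetical.

```python
import json
import sys


def summarize_alert(raw: str) -> str:
    """Turn one JSON alert document into a single summary line."""
    alert = json.loads(raw)
    # Assumed key names: lowercase versions of ID, Message and Level.
    level = alert.get("level", "UNKNOWN")
    alert_id = alert.get("id", "")
    message = alert.get("message", "")
    return f"[{level}] {alert_id}: {message}"


def main() -> None:
    """Entry point for a deployed script: read the alert from STDIN and print the summary."""
    print(summarize_alert(sys.stdin.read()))
```

A deployed script would end with a call to main() and be referenced by its absolute path, for example .exec('/usr/local/bin/summarize_alert.py'), where the path is hypothetical.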

Flapping

Perform flap detection on the alerts. The method used is similar to the one used by Nagios: https://assets.nagios.com/downloads/nagioscore/docs/nagioscore/3/en/flapping.html

Each different alerting level is considered a different state. The low and high thresholds are inverted thresholds of a percentage of state changes: if the percentage of state changes goes above the high threshold, the alert enters a flapping state, and it remains in the flapping state until the percentage of state changes goes below the low threshold. Typical values are low: 0.25 and high: 0.5. The percentage values represent the number of state changes over the total possible number of state changes. A percentage change of 0.5 means that the alert changed state in half of the recorded history, and remained the same in the other half of the history.
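The threshold logic can be illustrated with a simplified Python sketch. It counts every state change equally, whereas Nagios-style detection may weight recent changes more heavily, so treat it as an approximation of the idea rather than Kapacitor's exact calculation. The function names are hypothetical.

```python
from typing import Sequence


def percent_state_changes(history: Sequence[str]) -> float:
    """Fraction of adjacent pairs in the history whose state differs."""
    if len(history) < 2:
        return 0.0
    changes = sum(1 for prev, cur in zip(history, history[1:]) if prev != cur)
    return changes / (len(history) - 1)


def update_flapping(is_flapping: bool, history: Sequence[str],
                    low: float = 0.25, high: float = 0.5) -> bool:
    """Apply the two thresholds with hysteresis and return the new flapping flag."""
    pct = percent_state_changes(history)
    if is_flapping:
        # Stay in the flapping state until the percentage drops below low.
        return pct >= low
    # Enter the flapping state only when the percentage exceeds high.
    return pct > high


history = ["OK", "CRITICAL", "OK", "CRITICAL", "OK"]
print(percent_state_changes(history), update_flapping(False, history))
```

Using two thresholds rather than one keeps an alert that hovers around a single cutoff from repeatedly entering and leaving the flapping state.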

node.flapping(low float64, high float64)

HipChat

If the 'hipchat' section in the configuration has the option global = true, then all alerts are sent to HipChat without the need to explicitly state it in the TICKscript.

Example:

    [hipchat]
      enabled = true
      url = "https://orgname.hipchat.com/v2/room"
      room = "Test Room"
      token = "9hiWoDOZ9IbmHsOTeST123ABciWTIqXQVFDo63h9"
      global = true

Example:

    stream...
         .alert()

Send alert to HipChat using default room 'Test Room'. NOTE: The global option for HipChat also implies stateChangesOnly is set on all alerts. Also, the room can either be the room id (numerical) or the room name.

node.hipChat()

HipChat Room

HipChat room in which to post messages. If empty, uses the room from the configuration.

node.hipChat()
      .room(value string)

HipChat Token

HipChat authentication token. If empty uses the token from the configuration.

node.hipChat()
      .token(value string)

History

Number of previous states to remember when computing flapping levels and checking for state changes. Minimum value is 2 in order to keep track of current and previous states.

Default: 21

node.history(value int64)

Id

Template for constructing a unique ID for a given alert.

Available template data:

  • Name – Measurement name.
  • Group – Concatenation of all group-by tags of the form [key=value,]+. If no groupBy is performed, this is the literal 'nil'.
  • Tags – Map of tags. Use '{{ index .Tags "key" }}' to get a specific tag value.

Example:

   stream.from().measurement('cpu')
   .groupBy('cpu')
   .alert()
      .id('kapacitor/{{ .Name }}/{{ .Group }}')

ID: kapacitor/cpu/cpu=cpu0,

Example:

   stream...
   .groupBy('service')
   .alert()
      .id('kapacitor/{{ index .Tags "service" }}')

ID: kapacitor/authentication

Example:

   stream...
   .groupBy('service', 'host')
   .alert()
      .id('kapacitor/{{ index .Tags "service" }}/{{ index .Tags "host" }}')

ID: kapacitor/authentication/auth001.example.com

Default: {{ .Name }}:{{ .Group }}
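The Group value and the default ID can be approximated with the Python sketch below. It does not run the real template engine; it only mimics the documented [key=value,]+ form. Sorting the tags by key is an assumption made here to keep the output deterministic, and both function names are hypothetical.

```python
from typing import Mapping


def group_string(group_tags: Mapping[str, str]) -> str:
    """Build the Group value: 'key=value,' per group-by tag, or 'nil' without a groupBy."""
    if not group_tags:
        return "nil"
    # Assumption: tags are emitted in key order.
    return "".join(f"{key}={value}," for key, value in sorted(group_tags.items()))


def default_id(name: str, group_tags: Mapping[str, str]) -> str:
    """Mimic the default ID template '{{ .Name }}:{{ .Group }}'."""
    return f"{name}:{group_string(group_tags)}"


print("kapacitor/cpu/" + group_string({"cpu": "cpu0"}))
print(default_id("cpu", {}))
```

The trailing comma in the result comes from the [key=value,]+ form and matches the first example ID above.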

node.id(value string)

Info

Filter expression for the INFO alert level. An empty value indicates the level is invalid and is skipped.

node.info(value tick.Node)

Log

Log JSON alert data to file. One event per line. Must specify the absolute path to the log file. It will be created if it does not exist.

node.log(filepath string)

Message

Template for constructing a meaningful message for the alert.

Available template data:

  • ID – The ID of the alert.
  • Name – Measurement name.
  • Group – Concatenation of all group-by tags of the form [key=value,]+. If no groupBy is performed, this is the literal 'nil'.
  • Tags – Map of tags. Use '{{ index .Tags "key" }}' to get a specific tag value.
  • Level – Alert Level, one of: INFO, WARNING, CRITICAL.
  • Fields – Map of fields. Use '{{ index .Fields "key" }}' to get a specific field value.

Example:

   stream...
   .groupBy('service', 'host')
   .alert()
      .id('{{ index .Tags "service" }}/{{ index .Tags "host" }}')
      .message('{{ .ID }} is {{ .Level }} value: {{ index .Fields "value" }}')

Message: authentication/auth001.example.com is CRITICAL value: 42

Default: {{ .ID }} is {{ .Level }}

node.message(value string)

OpsGenie

Send alert to OpsGenie. To use OpsGenie alerting you must first enable the 'Alert Ingestion API' in the 'Integrations' section of OpsGenie. Then place the API key from the URL into the 'opsgenie' section of the Kapacitor configuration.

Example:

    [opsgenie]
      enabled = true
      api-key = "xxxxx"
      teams = ["everyone"]
      recipients = ["jim", "bob"]

With the correct configuration you can now use OpsGenie in TICKscripts.

Example:

    stream...
         .alert()
             .opsGenie()

Send alerts to OpsGenie using the teams and recipients in the configuration file.

Example:

    stream...
         .alert()
             .opsGenie()
             .teams('team_rocket','team_test')

Send alerts to OpsGenie with team set to 'team_rocket' and 'team_test'

If the 'opsgenie' section in the configuration has the option global = true, then all alerts are sent to OpsGenie without the need to explicitly state it in the TICKscript.

Example:

    [opsgenie]
      enabled = true
      api-key = "xxxxx"
      recipients = ["johndoe"]
      global = true

Example:

    stream...
         .alert()

Send alert to OpsGenie using the default recipients, found in the configuration.

node.opsGenie()

OpsGenie Recipients

The list of recipients to be alerted. If empty defaults to the recipients from the configuration.

node.opsGenie()
      .recipients(recipients ...string)

OpsGenie Teams

The list of teams to be alerted. If empty defaults to the teams from the configuration.

node.opsGenie()
      .teams(teams ...string)

PagerDuty

Send the alert to PagerDuty. To use PagerDuty alerting you must first follow the steps to enable a new 'Generic API' service.

From https://developer.pagerduty.com/documentation/integration/events

  1. In your account, under the Services tab, click "Add New Service".
  2. Enter a name for the service and select an escalation policy. Then, select "Generic API" for the Service Type.
  3. Click the "Add Service" button.
  4. Once the service is created, you'll be taken to the service page. On this page, you'll see the "Service key", which is needed to access the API.

Place the 'service key' into the 'pagerduty' section of the Kapacitor configuration as the option 'service-key'.

Example:

    [pagerduty]
      enabled = true
      service-key = "xxxxxxxxx"

With the correct configuration you can now use PagerDuty in TICKscripts.

Example:

    stream...
         .alert()
             .pagerDuty()

If the 'pagerduty' section in the configuration has the option global = true, then all alerts are sent to PagerDuty without the need to explicitly state it in the TICKscript.

Example:

    [pagerduty]
      enabled = true
      service-key = "xxxxxxxxx"
      global = true

Example:

    stream...
         .alert()

Send alert to PagerDuty.

node.pagerDuty()

Post

HTTP POST JSON alert data to a specified URL.
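A receiving endpoint could be sketched with Python's standard library as follows. The JSON key names (`id`, `level`, `message`) are assumed to be lowercase forms of the documented alert data fields, and the port and all names are hypothetical choices for illustration.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def parse_alert(body: bytes) -> dict:
    """Decode a POSTed alert and keep the fields this receiver cares about."""
    alert = json.loads(body.decode("utf-8"))
    if not isinstance(alert, dict):
        raise ValueError("alert payload must be a JSON object")
    # Assumed key names: lowercase versions of ID, Level and Message.
    return {key: alert.get(key) for key in ("id", "level", "message")}


class AlertHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        try:
            alert = parse_alert(self.rfile.read(length))
        except ValueError:  # also covers malformed JSON and bad UTF-8
            self.send_response(400)
            self.end_headers()
            return
        print(f"{alert['level']}: {alert['message']}")
        self.send_response(204)  # accepted, no response body
        self.end_headers()


def serve(port: int = 8080) -> None:
    """Block forever, handling alert POSTs on the given port."""
    HTTPServer(("", port), AlertHandler).serve_forever()
```

Calling serve() starts the listener; the URL passed to .post() would then point at that host and port.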

node.post(url string)

Slack

Send the alert to Slack. To allow Kapacitor to post to Slack, go to https://slack.com/services/new/incoming-webhook, create a new incoming webhook, and place the generated URL in the 'slack' configuration section.

Example:

    [slack]
      enabled = true
      url = "https://hooks.slack.com/services/xxxxxxxxx/xxxxxxxxx/xxxxxxxxxxxxxxxxxxxxxxxx"
      channel = "#general"

To avoid posting a message at every alert interval, use AlertNode.StateChangesOnly so that only events where the alert changed state are posted to the channel.

Example:

    stream...
         .alert()
             .slack()

Send alerts to Slack channel in the configuration file.

Example:

    stream...
         .alert()
             .slack()
             .channel('#alerts')

Send alerts to Slack channel '#alerts'

Example:

    stream...
         .alert()
             .slack()
             .channel('@jsmith')

Send alert to user '@jsmith'

If the 'slack' section in the configuration has the option global = true, then all alerts are sent to Slack without the need to explicitly state it in the TICKscript.

Example:

    [slack]
      enabled = true
      url = "https://hooks.slack.com/services/xxxxxxxxx/xxxxxxxxx/xxxxxxxxxxxxxxxxxxxxxxxx"
      channel = "#general"
      global = true

Example:

    stream...
         .alert()

Send alert to Slack using default channel '#general'. NOTE: The global option for Slack also implies stateChangesOnly is set on all alerts.

node.slack()

Slack Channel

Slack channel in which to post messages. If empty uses the channel from the configuration.

node.slack()
      .channel(value string)

StateChangesOnly

Only sends events where the state changed. Each alert level (OK, INFO, WARNING, and CRITICAL) is considered a different state.

Example:

    stream...
        .window()
             .period(10s)
             .every(10s)
        .alert()
            .crit(lambda: "value" > 10)
            .stateChangesOnly()
            .slack()

If "value" is greater than 10 for a total of 60s, then only two events will be sent: the first when the value crosses the threshold, and the second when it falls back into the OK state. Without stateChangesOnly, seven events would have been sent: one for each of the six 10s periods in which the condition was met, and one more for the recovery.
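The event counts above can be reproduced with a small Python simulation of the sending rule. It assumes the alert starts in the OK state and that each 10s window yields exactly one level; the function name `events_sent` is hypothetical.

```python
from typing import Iterable, List


def events_sent(levels: Iterable[str], state_changes_only: bool) -> List[str]:
    """Return the levels of the events that would reach the handlers."""
    sent: List[str] = []
    previous = "OK"  # assumption: the alert starts out in the OK state
    for level in levels:
        changed = level != previous
        if state_changes_only:
            should_send = changed
        else:
            # Non-OK states always send; OK sends only on recovery.
            should_send = level != "OK" or changed
        if should_send:
            sent.append(level)
        previous = level
    return sent


# Six 10s windows above the threshold, then two windows back to normal.
windows = ["CRITICAL"] * 6 + ["OK", "OK"]
print(len(events_sent(windows, state_changes_only=False)))
print(len(events_sent(windows, state_changes_only=True)))
```

The second OK window sends nothing in either mode, because the alert has already recovered.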

node.stateChangesOnly()

VictorOps

Send alert to VictorOps. To use VictorOps alerting you must first enable the 'Alert Ingestion API' in the 'Integrations' section of VictorOps. Then place the API key from the URL into the 'victorops' section of the Kapacitor configuration.

Example:

    [victorops]
      enabled = true
      api-key = "xxxxx"
      routing-key = "everyone"

With the correct configuration you can now use VictorOps in TICKscripts.

Example:

    stream...
         .alert()
             .victorOps()

Send alerts to VictorOps using the routing key in the configuration file.

Example:

    stream...
         .alert()
             .victorOps()
             .routingKey('team_rocket')

Send alerts to VictorOps with routing key 'team_rocket'

If the 'victorops' section in the configuration has the option global = true, then all alerts are sent to VictorOps without the need to explicitly state it in the TICKscript.

Example:

    [victorops]
      enabled = true
      api-key = "xxxxx"
      routing-key = "everyone"
      global = true

Example:

    stream...
         .alert()

Send alert to VictorOps using the default routing key, found in the configuration.

node.victorOps()

VictorOps RoutingKey

The routing key to use for the alert. Defaults to the value in the configuration if empty.

node.victorOps()
      .routingKey(value string)

Warn

Filter expression for the WARNING alert level. An empty value indicates the level is invalid and is skipped.

node.warn(value tick.Node)