AlertNode

Warning! This page documents an old version of Kapacitor, which is no longer actively developed. Kapacitor v1.3 is the most recent stable version of Kapacitor.

An AlertNode can trigger an event of varying severity levels and pass the event to alert handlers. The criteria for triggering an alert are specified via lambda expressions. See AlertNode.Info, AlertNode.Warn, and AlertNode.Crit below.

Different event handlers can be configured for each AlertNode. Some handlers like Email, HipChat, Sensu, Slack, OpsGenie, VictorOps, PagerDuty and Talk have a configuration option 'global' that indicates that all alerts implicitly use the handler.

Available event handlers:

  • log – log alert data to file.
  • post – HTTP POST data to a specified URL.
  • email – Send an email with alert data.
  • exec – Execute a command passing alert data over STDIN.
  • HipChat – Post alert message to HipChat room.
  • Alerta – Post alert message to Alerta.
  • Sensu – Post alert message to Sensu client.
  • Slack – Post alert message to Slack channel.
  • OpsGenie – Send alert to OpsGenie.
  • VictorOps – Send alert to VictorOps.
  • PagerDuty – Send alert to PagerDuty.
  • Talk – Post alert message to Talk client.

See below for more details on configuring each handler.

Each event that gets sent to a handler contains the following alert data:

  • ID – the ID of the alert, user defined.
  • Message – the alert message, user defined.
  • Details – the alert details, user defined HTML content.
  • Time – the time the alert occurred.
  • Level – one of OK, INFO, WARNING or CRITICAL.
  • Data – influxql.Result containing the data that triggered the alert.

Events are sent to handlers if the alert is in a state other than 'OK', or if the alert has just changed to the 'OK' state from a non-'OK' state (that is, the alert recovered). When the AlertNode.StateChangesOnly property is set, events are sent to handlers only if the alert changed state.

It is valid to configure multiple alert handlers, even with the same type.

Example:

   stream
        .groupBy('service')
        .alert()
            .id('kapacitor/{{ index .Tags "service" }}')
            .message('{{ .ID }} is {{ .Level }} value:{{ index .Fields "value" }}')
            .info(lambda: "value" > 10)
            .warn(lambda: "value" > 20)
            .crit(lambda: "value" > 30)
            .post("http://example.com/api/alert")
            .post("http://another.example.com/api/alert")
            .email().to('oncall@example.com')

It is assumed that each successive level filters a subset of the previous level. As a result, a level's filter is applied only if the data point passed the previous level. In the above example, if value = 15, the INFO expression passes, so the WARNING expression is evaluated; the WARNING expression fails, so the CRITICAL expression is never evaluated. Each expression maintains its own state.
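The cascade described above can be sketched in Go. This is only an illustration, not Kapacitor's implementation: the `determineLevel` helper and the `Level` type are hypothetical names, and the sketch assumes all three level expressions are set.

```go
package main

import "fmt"

// Level mirrors the documented alert levels. The type is hypothetical.
type Level int

const (
	OK Level = iota
	Info
	Warning
	Critical
)

func (l Level) String() string {
	return [...]string{"OK", "INFO", "WARNING", "CRITICAL"}[l]
}

// determineLevel (hypothetical helper) applies the info, warn and crit
// predicates in order and stops at the first one that fails. It returns the
// resulting level and the number of predicates that were evaluated.
func determineLevel(value float64, preds []func(float64) bool) (Level, int) {
	level, evaluated := OK, 0
	for i, p := range preds {
		evaluated++
		if !p(value) {
			break
		}
		level = Level(i + 1)
	}
	return level, evaluated
}

func main() {
	preds := []func(float64) bool{
		func(v float64) bool { return v > 10 }, // .info(lambda: "value" > 10)
		func(v float64) bool { return v > 20 }, // .warn(lambda: "value" > 20)
		func(v float64) bool { return v > 30 }, // .crit(lambda: "value" > 30)
	}
	for _, v := range []float64{5, 15, 25, 35} {
		level, n := determineLevel(v, preds)
		fmt.Printf("value=%v level=%v expressions evaluated=%d\n", v, level, n)
	}
}
```

For value = 15 the sketch evaluates two expressions and reports INFO, matching the description above.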

Properties

Property methods modify state on the calling node. They do not add another node to the pipeline, and always return a reference to the calling node.

Alerta

Send the alert to Alerta.

Example:

    [alerta]
      enabled = true
      url = "https://alerta.yourdomain"
      token = "9hiWoDOZ9IbmHsOTeST123ABciWTIqXQVFDo63h9"
      environment = "Production"
      origin = "Kapacitor"

To avoid posting a message every alert interval, use AlertNode.StateChangesOnly so that only events where the alert changed state are sent to Alerta.

Send alerts to Alerta. The resource and event properties are required.

Example:

    stream...
         .alert()
             .alerta()
                 .resource('Hostname or service')
                 .event('Something went wrong')

Alerta also accepts optional alert information.

Example:

    stream...
         .alert()
             .alerta()
                 .resource('Hostname or service')
                 .event('Something went wrong')
                 .environment('Development')
                 .group('Dev. Servers')

NOTE: Alerta cannot be configured globally because of its required properties.

node.alerta()

Alerta Environment

Alerta environment. If empty uses the environment from the configuration.

node.alerta()
      .environment(value string)

Alerta Event

Alerta event. This is a required field.

node.alerta()
      .event(value string)

Alerta Group

Alerta group.

node.alerta()
      .group(value string)

Alerta Origin

Alerta origin. If empty uses the origin from the configuration.

node.alerta()
      .origin(value string)

Alerta Resource

Alerta resource. This is a required field.

node.alerta()
      .resource(value string)

Alerta Token

Alerta authentication token. If empty uses the token from the configuration.

node.alerta()
      .token(value string)

Alerta Value

Alerta value.

node.alerta()
      .value(value string)

Crit

Filter expression for the CRITICAL alert level. An empty value indicates the level is invalid and is skipped.

node.crit(value tick.Node)

Details

Template for constructing a detailed HTML message for the alert. The same template data is available as for the AlertNode.Message property, in addition to a Message field that contains the rendered Message value.

The intent is that the Message property be a single line summary while the Details property is a more detailed message possibly spanning multiple lines, and containing HTML formatting.

This template is rendered using the html/template package in Go so that safe and valid HTML can be generated.

The json method is available within the template to convert any variable to a valid JSON string.

Example:

    .alert()
       .id('{{ .Name }}')
       .details('''
<h1>{{ .ID }}</h1>
<b>{{ .Message }}</b>
Value: {{ index .Fields "value" }}
''')
       .email()

Default: {{ json . }}

node.details(value string)
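The relationship between Message and Details can be sketched with Go's standard template packages. This is a minimal sketch under assumptions: the `alertInfo` struct and the `renderMessageAndDetails` helper are hypothetical, the message is assumed to be rendered with text/template (only the use of html/template for Details is stated above), and the `json` template function is left out.

```go
package main

import (
	"fmt"
	htmltemplate "html/template"
	"strings"
	texttemplate "text/template"
)

// alertInfo is a hypothetical subset of the documented template data.
type alertInfo struct {
	ID      string
	Level   string
	Fields  map[string]interface{}
	Message string // filled in once the message template has been rendered
}

// renderMessageAndDetails renders the message first, then exposes the result
// to the details template through the Message field.
func renderMessageAndDetails(msgTmpl, detailsTmpl string, info alertInfo) (string, string, error) {
	mt, err := texttemplate.New("message").Parse(msgTmpl)
	if err != nil {
		return "", "", err
	}
	var msg strings.Builder
	if err := mt.Execute(&msg, info); err != nil {
		return "", "", err
	}
	info.Message = msg.String()

	// html/template escapes inserted values so the output stays valid HTML.
	dt, err := htmltemplate.New("details").Parse(detailsTmpl)
	if err != nil {
		return "", "", err
	}
	var details strings.Builder
	if err := dt.Execute(&details, info); err != nil {
		return "", "", err
	}
	return info.Message, details.String(), nil
}

func main() {
	info := alertInfo{
		ID:     "cpu:nil",
		Level:  "CRITICAL",
		Fields: map[string]interface{}{"value": 42.0},
	}
	msg, details, err := renderMessageAndDetails(
		`{{ .ID }} is {{ .Level }}`,
		`<h1>{{ .ID }}</h1><b>{{ .Message }}</b> Value: {{ index .Fields "value" }}`,
		info,
	)
	if err != nil {
		fmt.Println("template error:", err)
		return
	}
	fmt.Println(msg)
	fmt.Println(details)
}
```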

Email

Email the alert data.

If the To list is empty, the To addresses from the configuration are used. The email subject is the AlertNode.Message property. The email body is the AlertNode.Details property. The emails are sent as HTML emails, so the body can contain HTML markup.

If the 'smtp' section in the configuration has the option: global = true then all alerts are sent via email without the need to explicitly state it in the TICKscript.

Example:

    .alert()
       .id('{{ .Name }}')
       // Email subject
       .message('{{ .ID }}:{{ .Level }}')
       //Email body as HTML
       .details('''
<h1>{{ .ID }}</h1>
<b>{{ .Message }}</b>
Value: {{ index .Fields "value" }}
''')
       .email()

Send an email with custom subject and body.

Example:

     [smtp]
       enabled = true
       host = "localhost"
       port = 25
       username = ""
       password = ""
       from = "kapacitor@example.com"
       to = ["oncall@example.com"]
       # Set global to true so that all alerts trigger emails.
       global = true

Example:

    stream...
         .alert()

Send email to 'oncall@example.com' from 'kapacitor@example.com'

NOTE: The global option for email also implies stateChangesOnly is set on all alerts.

node.email(to ...string)

Exec

Execute a command whenever an alert is triggered and pass the alert data over STDIN in JSON format.

node.exec(executable string, args ...string)
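A command run by exec could decode the alert data roughly as follows. This is a sketch under assumptions: the `alertData` struct and `decodeAlert` helper are hypothetical, the JSON keys are assumed to match the documented field names (Go's encoding/json matches keys case-insensitively), the time and level are assumed to be encoded as strings, and the sample payload is made up.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// alertData lists the documented event fields. Data is kept as raw JSON
// because its exact shape is not described here.
type alertData struct {
	ID      string
	Message string
	Details string
	Time    string
	Level   string
	Data    json.RawMessage
}

// decodeAlert parses one JSON-encoded alert event.
func decodeAlert(payload []byte) (alertData, error) {
	var ad alertData
	err := json.Unmarshal(payload, &ad)
	return ad, err
}

func main() {
	// Made-up payload for illustration. A real handler would read the
	// payload from os.Stdin instead, for example with io.ReadAll.
	payload := []byte(`{"id":"kapacitor/authentication","message":"kapacitor/authentication is CRITICAL value:42","time":"2016-01-01T00:00:00Z","level":"CRITICAL","data":{}}`)

	ad, err := decodeAlert(payload)
	if err != nil {
		fmt.Println("cannot decode alert:", err)
		return
	}
	fmt.Printf("%s [%s] %s\n", ad.Time, ad.Level, ad.Message)
}
```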

Flapping

Perform flap detection on the alerts. The method used is similar to the one Nagios uses: https://assets.nagios.com/downloads/nagioscore/docs/nagioscore/3/en/flapping.html

Each different alerting level is considered a different state. The low and high thresholds are inverted thresholds of a percentage of state changes: if the percentage of state changes goes above the high threshold, the alert enters a flapping state, and it remains in the flapping state until the percentage of state changes goes below the low threshold. Typical values are low: 0.25 and high: 0.5. The percentage values represent the number of state changes over the total possible number of state changes. A percentage change of 0.5 means that the alert changed state in half of the recorded history and remained the same in the other half.

node.flapping(low float64, high float64)
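The threshold behaviour described above can be sketched in Go. The `percentChange` and `nextFlapping` helpers are hypothetical and follow only the description on this page; in particular, the sketch counts every state change equally, whereas the Nagios method weights recent changes more heavily, so the real computation may differ.

```go
package main

import "fmt"

// percentChange returns the number of state changes in history divided by
// the total possible number of state changes (len(history) - 1).
func percentChange(history []string) float64 {
	if len(history) < 2 {
		return 0
	}
	changes := 0
	for i := 1; i < len(history); i++ {
		if history[i] != history[i-1] {
			changes++
		}
	}
	return float64(changes) / float64(len(history)-1)
}

// nextFlapping applies the inverted thresholds: enter the flapping state
// above high, leave it below low, otherwise keep the current state.
func nextFlapping(flapping bool, pct, low, high float64) bool {
	if pct > high {
		return true
	}
	if pct < low {
		return false
	}
	return flapping
}

func main() {
	const low, high = 0.25, 0.5
	flapping := false
	histories := [][]string{
		{"OK", "CRITICAL", "OK", "CRITICAL", "OK"}, // 4 of 4 possible changes
		{"OK", "CRITICAL", "CRITICAL", "OK", "OK"}, // 2 of 4 possible changes
		{"OK", "OK", "OK", "OK", "OK"},             // 0 of 4 possible changes
	}
	for _, h := range histories {
		pct := percentChange(h)
		flapping = nextFlapping(flapping, pct, low, high)
		fmt.Printf("percent change %.2f, flapping %v\n", pct, flapping)
	}
}
```

The second history sits between the two thresholds, so the alert stays in whatever flapping state it was already in.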

HipChat

If the 'hipchat' section in the configuration has the option: global = true then all alerts are sent to HipChat without the need to explicitly state it in the TICKscript.

Example:

    [hipchat]
      enabled = true
      url = "https://orgname.hipchat.com/v2/room"
      room = "Test Room"
      token = "9hiWoDOZ9IbmHsOTeST123ABciWTIqXQVFDo63h9"
      global = true

Example:

    stream...
         .alert()

Send alert to HipChat using default room 'Test Room'. NOTE: The global option for HipChat also implies stateChangesOnly is set on all alerts. Also, the room can either be the room id (numerical) or the room name.

node.hipChat()

HipChat Room

HipChat room in which to post messages. If empty, uses the room from the configuration.

node.hipChat()
      .room(value string)

HipChat Token

HipChat authentication token. If empty uses the token from the configuration.

node.hipChat()
      .token(value string)

History

Number of previous states to remember when computing flapping levels and checking for state changes. Minimum value is 2 in order to keep track of current and previous states.

Default: 21

node.history(value int64)

Id

Template for constructing a unique ID for a given alert.

Available template data:

  • Name – Measurement name.
  • TaskName – The name of the task.
  • Group – Concatenation of all group-by tags of the form [key=value,]+. If no groupBy is performed, equal to the literal 'nil'.
  • Tags – Map of tags. Use '{{ index .Tags "key" }}' to get a specific tag value.

Example:

   stream.from().measurement('cpu')
   .groupBy('cpu')
   .alert()
      .id('kapacitor/{{ .Name }}/{{ .Group }}')

ID: kapacitor/cpu/cpu=cpu0,

Example:

   stream...
   .groupBy('service')
   .alert()
      .id('kapacitor/{{ index .Tags "service" }}')

ID: kapacitor/authentication

Example:

   stream...
   .groupBy('service', 'host')
   .alert()
      .id('kapacitor/{{ index .Tags "service" }}/{{ index .Tags "host" }}')

ID: kapacitor/authentication/auth001.example.com

Default: {{ .Name }}:{{ .Group }}

node.id(value string)
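The ID examples above can be reproduced with Go's text/template package. This is a sketch under assumptions: the `idInfo` struct and `renderID` helper are hypothetical, the task name is made up, and the ID template is assumed to follow text/template syntax, which the `index` calls in the examples suggest.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// idInfo mirrors the template data listed above. The struct is hypothetical.
type idInfo struct {
	Name     string
	TaskName string
	Group    string
	Tags     map[string]string
}

// renderID executes an ID template against the given data.
func renderID(tmpl string, info idInfo) (string, error) {
	t, err := template.New("id").Parse(tmpl)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := t.Execute(&sb, info); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	info := idInfo{
		Name:     "cpu",
		TaskName: "cpu_alert", // made-up task name
		Group:    "cpu=cpu0,",
		Tags:     map[string]string{"cpu": "cpu0"},
	}
	templates := []string{
		`kapacitor/{{ .Name }}/{{ .Group }}`,
		`{{ .Name }}:{{ .Group }}`, // the documented default
		`kapacitor/{{ index .Tags "cpu" }}`,
	}
	for _, tmpl := range templates {
		id, err := renderID(tmpl, info)
		if err != nil {
			fmt.Println("template error:", err)
			continue
		}
		fmt.Println(id)
	}
}
```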

Info

Filter expression for the INFO alert level. An empty value indicates the level is invalid and is skipped.

node.info(value tick.Node)

Log

Log JSON alert data to file. One event per line. Must specify the absolute path to the log file. It will be created if it does not exist.

node.log(filepath string)

Message

Template for constructing a meaningful message for the alert.

Available template data:

  • ID – The ID of the alert.
  • Name – Measurement name.
  • TaskName – The name of the task.
  • Group – Concatenation of all group-by tags of the form [key=value,]+. If no groupBy is performed, equal to the literal 'nil'.
  • Tags – Map of tags. Use '{{ index .Tags "key" }}' to get a specific tag value.
  • Level – Alert Level, one of: INFO, WARNING, CRITICAL.
  • Fields – Map of fields. Use '{{ index .Fields "key" }}' to get a specific field value.

Example:

   stream...
   .groupBy('service', 'host')
   .alert()
      .id('{{ index .Tags "service" }}/{{ index .Tags "host" }}')
      .message('{{ .ID }} is {{ .Level }} value: {{ index .Fields "value" }}')

Message: authentication/auth001.example.com is CRITICAL value: 42

Default: {{ .ID }} is {{ .Level }}

node.message(value string)

OpsGenie

Send alert to OpsGenie. To use OpsGenie alerting you must first enable the 'Alert Ingestion API' in the 'Integrations' section of OpsGenie. Then place the API key from the URL into the 'opsgenie' section of the Kapacitor configuration.

Example:

    [opsgenie]
      enabled = true
      api-key = "xxxxx"
      teams = ["everyone"]
      recipients = ["jim", "bob"]

With the correct configuration you can now use OpsGenie in TICKscripts.

Example:

    stream...
         .alert()
             .opsGenie()

Send alerts to OpsGenie using the teams and recipients in the configuration file.

Example:

    stream...
         .alert()
             .opsGenie()
                 .teams('team_rocket', 'team_test')

Send alerts to OpsGenie with team set to 'team_rocket' and 'team_test'

If the 'opsgenie' section in the configuration has the option: global = true then all alerts are sent to OpsGenie without the need to explicitly state it in the TICKscript.

Example:

    [opsgenie]
      enabled = true
      api-key = "xxxxx"
      recipients = ["johndoe"]
      global = true

Example:

    stream...
         .alert()

Send alert to OpsGenie using the default recipients, found in the configuration.

node.opsGenie()

OpsGenie Recipients

The list of recipients to be alerted. If empty defaults to the recipients from the configuration.

node.opsGenie()
      .recipients(recipients ...string)

OpsGenie Teams

The list of teams to be alerted. If empty defaults to the teams from the configuration.

node.opsGenie()
      .teams(teams ...string)

PagerDuty

Send the alert to PagerDuty. To use PagerDuty alerting you must first follow the steps to enable a new 'Generic API' service.

From https://developer.pagerduty.com/documentation/integration/events

  1. In your account, under the Services tab, click "Add New Service".
  2. Enter a name for the service and select an escalation policy. Then, select "Generic API" for the Service Type.
  3. Click the "Add Service" button.
  4. Once the service is created, you'll be taken to the service page. On this page, you'll see the "Service key", which is needed to access the API.

Place the 'service key' into the 'pagerduty' section of the Kapacitor configuration as the option 'service-key'.

Example:

    [pagerduty]
      enabled = true
      service-key = "xxxxxxxxx"

With the correct configuration you can now use PagerDuty in TICKscripts.

Example:

    stream...
         .alert()
             .pagerDuty()

If the 'pagerduty' section in the configuration has the option: global = true then all alerts are sent to PagerDuty without the need to explicitly state it in the TICKscript.

Example:

    [pagerduty]
      enabled = true
      service-key = "xxxxxxxxx"
      global = true

Example:

    stream...
         .alert()

Send alert to PagerDuty.

node.pagerDuty()

Post

HTTP POST JSON alert data to a specified URL.

node.post(url string)
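An endpoint that receives these POST requests could look roughly like the Go sketch below. It is illustrative only: the `alertData` struct, the `alertHandler` function, and the `postSample` helper are hypothetical, the JSON keys are assumed to match the documented field names (Go's encoding/json matches keys case-insensitively), and the sample bodies are made up.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// alertData holds a subset of the documented event fields.
type alertData struct {
	ID      string
	Message string
	Level   string
}

// alertHandler accepts a JSON alert event sent with HTTP POST.
func alertHandler(w http.ResponseWriter, r *http.Request) {
	if r.Method != http.MethodPost {
		http.Error(w, "POST required", http.StatusMethodNotAllowed)
		return
	}
	var ad alertData
	if err := json.NewDecoder(r.Body).Decode(&ad); err != nil {
		http.Error(w, "invalid JSON", http.StatusBadRequest)
		return
	}
	fmt.Printf("received alert %q at level %s\n", ad.ID, ad.Level)
	w.WriteHeader(http.StatusNoContent)
}

// postSample sends body to alertHandler through an in-memory recorder and
// returns the resulting HTTP status code.
func postSample(body string) int {
	req := httptest.NewRequest(http.MethodPost, "/api/alert", strings.NewReader(body))
	rec := httptest.NewRecorder()
	alertHandler(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(postSample(`{"id":"kapacitor/authentication","level":"CRITICAL"}`))
	fmt.Println(postSample(`not json`))
}
```

In a real service the handler would be registered with `http.HandleFunc` and served with `http.ListenAndServe`; the in-memory recorder is used here only to keep the sketch self-contained.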

Sensu

Send the alert to Sensu.

Example:

    [sensu]
      enabled = true
      url = "http://sensu:3030"
      source = "Kapacitor"

Example:

    stream...
         .alert()
             .sensu()

Send alerts to Sensu client.

node.sensu()

Slack

Send the alert to Slack. To allow Kapacitor to post to Slack, go to the URL https://slack.com/services/new/incoming-webhook and create a new incoming webhook and place the generated URL in the 'slack' configuration section.

Example:

    [slack]
      enabled = true
      url = "https://hooks.slack.com/services/xxxxxxxxx/xxxxxxxxx/xxxxxxxxxxxxxxxxxxxxxxxx"
      channel = "#general"

To avoid posting a message every alert interval, use AlertNode.StateChangesOnly so that only events where the alert changed state are posted to the channel.

Example:

    stream...
         .alert()
             .slack()

Send alerts to Slack channel in the configuration file.

Example:

    stream...
         .alert()
             .slack()
                 .channel('#alerts')

Send alerts to Slack channel '#alerts'

Example:

    stream...
         .alert()
             .slack()
                 .channel('@jsmith')

Send alert to user '@jsmith'

If the 'slack' section in the configuration has the option: global = true then all alerts are sent to Slack without the need to explicitly state it in the TICKscript.

Example:

    [slack]
      enabled = true
      url = "https://hooks.slack.com/services/xxxxxxxxx/xxxxxxxxx/xxxxxxxxxxxxxxxxxxxxxxxx"
      channel = "#general"
      global = true

Example:

    stream...
         .alert()

Send alert to Slack using default channel '#general'. NOTE: The global option for Slack also implies stateChangesOnly is set on all alerts.

node.slack()

Slack Channel

Slack channel in which to post messages. If empty uses the channel from the configuration.

node.slack()
      .channel(value string)

StateChangesOnly

Only sends events where the state changed. Each alert level (OK, INFO, WARNING, and CRITICAL) is considered a different state.

Example:

    stream...
        .window()
             .period(10s)
             .every(10s)
        .alert()
            .crit(lambda: "value" > 10)
            .stateChangesOnly()
            .slack()

If the "value" is greater than 10 for a total of 60s, then only two events will be sent: the first when the value crosses the threshold, and the second when it falls back into an OK state. Without stateChangesOnly, the alert would have triggered 7 times: once for each of the six 10s periods where the condition was met, and once more for the recovery.

node.stateChangesOnly()
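The event counts in the example above can be checked with a small Go sketch. The `countEvents` helper is hypothetical and assumes the alert starts in the OK state; it applies the rule that an event is sent when the level is not OK or has just returned to OK, and, with stateChangesOnly, only when the level differs from the previous one.

```go
package main

import "fmt"

// countEvents (hypothetical helper) counts how many events would reach a
// handler for a sequence of alert levels, one level per evaluation.
func countEvents(levels []string, stateChangesOnly bool) int {
	prev, sent := "OK", 0 // assumption: the alert starts in the OK state
	for _, cur := range levels {
		changed := cur != prev
		if stateChangesOnly {
			if changed {
				sent++
			}
		} else if cur != "OK" || changed {
			sent++
		}
		prev = cur
	}
	return sent
}

func main() {
	// Six 10s windows above the threshold, followed by one window below it.
	levels := []string{
		"CRITICAL", "CRITICAL", "CRITICAL",
		"CRITICAL", "CRITICAL", "CRITICAL",
		"OK",
	}
	fmt.Println("without stateChangesOnly:", countEvents(levels, false))
	fmt.Println("with stateChangesOnly:", countEvents(levels, true))
}
```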

Talk

Send the alert to Talk. To use Talk alerting you must first follow the steps to create a new incoming webhook.

  1. Go to the URL https://account.jianliao.com/signin.
  2. Sign in with your account. Under the Team tab, click "Integrations".
  3. Select "Customize service", then click the "Add" button for incoming Webhook.
  4. After choosing the topic to connect with "xxx", click the "Confirm Add" button.
  5. Once the service is created, you'll see the "Generate Webhook url".

Place the 'Generate Webhook url' into the 'talk' section of the Kapacitor configuration as the option 'url'.

Example:

    [talk]
      enabled = true
      url = "https://jianliao.com/v2/services/webhook/uuid"
      author_name = "Kapacitor"

Example:

    stream...
         .alert()
             .talk()

Send alerts to Talk client.

node.talk()

VictorOps

Send alert to VictorOps. To use VictorOps alerting you must first enable the 'Alert Ingestion API' in the 'Integrations' section of VictorOps. Then place the API key from the URL into the 'victorops' section of the Kapacitor configuration.

Example:

    [victorops]
      enabled = true
      api-key = "xxxxx"
      routing-key = "everyone"

With the correct configuration you can now use VictorOps in TICKscripts.

Example:

    stream...
         .alert()
             .victorOps()

Send alerts to VictorOps using the routing key in the configuration file.

Example:

    stream...
         .alert()
             .victorOps()
                 .routingKey('team_rocket')

Send alerts to VictorOps with routing key 'team_rocket'

If the 'victorops' section in the configuration has the option: global = true then all alerts are sent to VictorOps without the need to explicitly state it in the TICKscript.

Example:

    [victorops]
      enabled = true
      api-key = "xxxxx"
      routing-key = "everyone"
      global = true

Example:

    stream...
         .alert()

Send alert to VictorOps using the default routing key, found in the configuration.

node.victorOps()

VictorOps RoutingKey

The routing key to use for the alert. Defaults to the value in the configuration if empty.

node.victorOps()
      .routingKey(value string)

Warn

Filter expression for the WARNING alert level. An empty value indicates the level is invalid and is skipped.

node.warn(value tick.Node)

Chaining Methods

Chaining methods create a new node in the pipeline as a child of the calling node. They do not modify the calling node.

Deadman

Helper function for creating an alert on low throughput, also known as a deadman's switch.

  • Threshold – trigger alert if throughput drops below threshold in points/interval.
  • Interval – how often to check the throughput.
  • Expressions – optional list of expressions to also evaluate. Useful for time of day alerting.

Example:

    var data = stream.from()...
    // Trigger critical alert if the throughput drops below 100 points per 10s and checked every 10s.
    data.deadman(100.0, 10s)
    //Do normal processing of data
    data....

The above is equivalent to this example:

    var data = stream.from()...
    // Trigger critical alert if the throughput drops below 100 points per 10s and checked every 10s.
    data.stats(10s)
          .derivative('collected')
              .unit(10s)
              .nonNegative()
          .alert()
              .id('node \'stream0\' in task \'{{ .TaskName }}\'')
              .message('{{ .ID }} is {{ if eq .Level "OK" }}alive{{ else }}dead{{ end }}: {{ index .Fields "collected" | printf "%0.3f" }} points/10s.')
              .crit(lambda: "collected" <= 100.0)
    //Do normal processing of data
    data....

The id and message alert properties can be configured globally via the 'deadman' configuration section.

Since the AlertNode is the last piece it can be further modified as normal. Example:

    var data = stream.from()...
    // Trigger critical alert if the throughput drops below 100 points per 10s and checked every 10s.
    data.deadman(100.0, 10s).slack().channel('#dead_tasks')
    //Do normal processing of data
    data....

You can specify additional lambda expressions to further constrain when the deadman's switch is triggered. Example:

    var data = stream.from()...
    // Trigger critical alert if the throughput drops below 100 points per 10s and checked every 10s.
    // Only trigger the alert if the time of day is between 8am-5pm.
    data.deadman(100.0, 10s, lambda: hour("time") >= 8 AND hour("time") <= 17)
    //Do normal processing of data
    data....

node.deadman(threshold float64, interval time.Duration, expr ...tick.Node)

Returns: AlertNode
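The throughput check behind the equivalent pipeline shown above (a non-negative derivative of the 'collected' counter, compared against the threshold) can be sketched in Go. The `throughput` and `isDead` helpers are hypothetical and the counter readings are made up; the comparison uses `<=` to match the `.crit` lambda in the expanded example.

```go
package main

import (
	"fmt"
	"time"
)

// throughput returns the change in the collected counter per unit of time.
// ok is false when no time has elapsed or the result would be negative,
// which mirrors dropping such points with nonNegative().
func throughput(prevCollected, curCollected float64, elapsed, unit time.Duration) (rate float64, ok bool) {
	if elapsed <= 0 {
		return 0, false
	}
	rate = (curCollected - prevCollected) / (float64(elapsed) / float64(unit))
	if rate < 0 {
		return 0, false
	}
	return rate, true
}

// isDead reports whether the rate is at or below the deadman threshold.
func isDead(rate, threshold float64) bool {
	return rate <= threshold
}

func main() {
	const interval = 10 * time.Second
	collected := []float64{0, 500, 550} // made-up counter readings, one per interval
	for i := 1; i < len(collected); i++ {
		rate, ok := throughput(collected[i-1], collected[i], interval, interval)
		if !ok {
			continue
		}
		fmt.Printf("%0.3f points/10s dead=%v\n", rate, isDead(rate, 100.0))
	}
}
```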

Stats

Create a new stream of data that contains the internal statistics of the node. The interval represents how often to emit the statistics based on real time. This means the interval time is independent of the times of the data points the source node is receiving.

node.stats(interval time.Duration)

Returns: StatsNode