BatchNode

Warning! This page documents an old version of Kapacitor, which is no longer actively developed. Kapacitor v1.3 is the most recent stable version of Kapacitor.

A BatchNode defines a source and a schedule for processing batch data. The data is queried from an InfluxDB database and then passed into the data pipeline.

Example:

 batch
     |query('''
         SELECT mean("value")
         FROM "telegraf"."default".cpu_usage_idle
         WHERE "host" = 'serverA'
     ''')
         .period(1m)
         .every(20s)
         .groupBy(time(10s), 'cpu')
     ...

In the above example InfluxDB is queried every 20 seconds; the window of time returned spans 1 minute and is grouped into 10 second buckets.

Index

Properties

Chaining Methods

Properties

Property methods modify state on the calling node. They do not add another node to the pipeline, and always return a reference to the calling node. Property methods are marked using the . operator.

Align

Align start and stop times for queries with even boundaries of the BatchNode.Every property. Does not apply if using the BatchNode.Cron property.

node.align()
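
As a minimal sketch, the task below reuses the measurement from the example at the top of this page. With Every set to 5m, aligned queries would be expected to start and stop on five-minute boundaries (for instance 12:00, 12:05, 12:10) rather than at times relative to when the task started. The period and interval values are illustrative only.

    // Query a 5 minute window every 5 minutes, aligned to the Every interval.
    batch
        |query('''
            SELECT mean("value")
            FROM "telegraf"."default".cpu_usage_idle
        ''')
            .period(5m)
            .every(5m)
            .align()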

Cluster

The name of a configured InfluxDB cluster. If empty, the default cluster will be used.

node.cluster(value string)

Cron

Define a schedule using cron syntax.

The specific cron implementation is documented here: https://github.com/gorhill/cronexpr#implementation

The Cron property is mutually exclusive with the Every property.

node.cron(value string)
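
A short sketch of a cron-scheduled query follows. It assumes a standard five-field expression (minute, hour, day of month, month, day of week), which the linked cronexpr documentation describes as supported; the schedule and period are illustrative values, not recommendations.

    // Run at minute 0 of every hour and query one hour of data.
    // Cron replaces Every here, since the two are mutually exclusive.
    batch
        |query('''
            SELECT mean("value")
            FROM "telegraf"."default".cpu_usage_idle
        ''')
            .period(1h)
            .cron('0 * * * *')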

Every

How often to query InfluxDB.

The Every property is mutually exclusive with the Cron property.

node.every(value time.Duration)

Fill

Fill the data. Options are:

  • Any numerical value
  • null - exhibits the same behavior as the default
  • previous - reports the value of the previous window
  • none - suppresses timestamps and values where the value is null

node.fill(value interface{})
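
The following sketch fills empty 10 second buckets with a numerical value. It reuses the query from the example at the top of this page. Passing the named options as quoted strings, such as .fill('previous'), is an assumption about how the interface{} argument is interpreted and is not confirmed by this page.

    // Replace missing 10s buckets with 0 instead of null.
    batch
        |query('''
            SELECT mean("value")
            FROM "telegraf"."default".cpu_usage_idle
        ''')
            .period(1m)
            .every(20s)
            .groupBy(time(10s))
            .fill(0)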

GroupBy

Group the data by a set of dimensions. Can specify one time dimension.

This property adds a GROUP BY clause to the query, so all the normal behaviors when querying InfluxDB with a GROUP BY apply. More details: https://influxdb.com/docs/v0.9/query_language/data_exploration.html#the-group-by-clause

Example:

    batch
        |query(...)
            .groupBy(time(10s), 'tag1', 'tag2')

node.groupBy(d ...interface{})

Offset

How far back in time to query from the current time.

For example, with an Offset of 2 hours and an Every of 5m, Kapacitor will query InfluxDB every 5 minutes for the window of data from 2 hours ago.

This applies to Cron schedules as well. If the cron specifies a run every Sunday at 1 AM and the Offset is 1 hour, then at 1 AM on Sunday the data from 12 AM will be queried.

node.offset(value time.Duration)
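
The Every case described above could be written roughly as follows. The 5 minute period is an assumption added for illustration; the page only specifies the Offset and Every values.

    // Every 5 minutes, query a 5 minute window of data from 2 hours ago.
    batch
        |query('''
            SELECT mean("value")
            FROM "telegraf"."default".cpu_usage_idle
        ''')
            .period(5m)
            .every(5m)
            .offset(2h)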

Period

The period or length of time that will be queried from InfluxDB.

node.period(value time.Duration)

Chaining Methods

Chaining methods create a new node in the pipeline as a child of the calling node. They do not modify the calling node. Chaining methods are marked using the | operator.

Alert

Create an alert node, which can trigger alerts.

node|alert()

Returns: AlertNode
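
A minimal sketch of alerting on batch data is shown below. It uses only the crit and message alert properties that appear in the Deadman example on this page. The threshold of 10.0 is an arbitrary illustrative value, and the field name "mean" assumes InfluxDB names the result of mean("value") that way.

    // Trigger a critical alert when mean idle CPU over the last minute is below 10.
    batch
        |query('''
            SELECT mean("value")
            FROM "telegraf"."default".cpu_usage_idle
        ''')
            .period(1m)
            .every(20s)
        |alert()
            .message('{{ .ID }} is {{ .Level }}')
            .crit(lambda: "mean" < 10.0)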

Bottom

Select the bottom num points for field and sort by any extra tags or fields.

node|bottom(num int64, field string, fieldsAndTags ...string)

Returns: InfluxQLNode

Count

Count the number of points.

node|count(field string)

Returns: InfluxQLNode

Deadman

Helper function for creating an alert on low throughput, aka deadman's switch.

  • Threshold – trigger alert if throughput drops below threshold in points/interval.
  • Interval – how often to check the throughput.
  • Expressions – optional list of expressions to also evaluate. Useful for time of day alerting.

Example:

    var data = stream
        |from()...
    // Trigger critical alert if the throughput drops below 100 points per 10s and checked every 10s.
    data
        |deadman(100.0, 10s)
    //Do normal processing of data
    data...

The above is equivalent to this example:

    var data = stream
        |from()...
    // Trigger critical alert if the throughput drops below 100 points per 10s and checked every 10s.
    data
        |stats(10s)
        |derivative('emitted')
            .unit(10s)
            .nonNegative()
        |alert()
            .id('node \'stream0\' in task \'{{ .TaskName }}\'')
            .message('{{ .ID }} is {{ if eq .Level "OK" }}alive{{ else }}dead{{ end }}: {{ index .Fields "emitted" | printf "%0.3f" }} points/10s.')
            .crit(lambda: "emitted" <= 100.0)
    //Do normal processing of data
    data...

The id and message alert properties can be configured globally via the 'deadman' configuration section.

Since the AlertNode is the last piece it can be further modified as normal. Example:

    var data = stream
        |from()...
    // Trigger critical alert if the throughput drops below 100 points per 10s and checked every 10s.
    data
        |deadman(100.0, 10s)
            .slack()
            .channel('#dead_tasks')
    //Do normal processing of data
    data...

You can specify additional lambda expressions to further constrain when the deadman's switch is triggered. Example:

    var data = stream
        |from()...
    // Trigger critical alert if the throughput drops below 100 points per 10s and checked every 10s.
    // Only trigger the alert if the time of day is between 8am-5pm.
    data
        |deadman(100.0, 10s, lambda: hour("time") >= 8 AND hour("time") <= 17)
    //Do normal processing of data
    data...

node|deadman(threshold float64, interval time.Duration, expr ...tick.Node)

Returns: AlertNode

Derivative

Create a new node that computes the derivative of adjacent points.

node|derivative(field string)

Returns: DerivativeNode
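
As a sketch, the task below computes a per-second rate of change over raw points. The unit and nonNegative properties are the ones used in the Deadman example above; the query selects the raw "value" field from the measurement used elsewhere on this page, purely for illustration.

    // Compute the change in "value" per second between adjacent points,
    // using the nonNegative property as in the Deadman example.
    batch
        |query('''
            SELECT "value"
            FROM "telegraf"."default".cpu_usage_idle
        ''')
            .period(1m)
            .every(20s)
        |derivative('value')
            .unit(1s)
            .nonNegative()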

Distinct

Produce a batch of only the distinct points.

node|distinct(field string)

Returns: InfluxQLNode

Eval

Create an eval node that will apply the given transformation function to each data point. A list of expressions may be provided; they are evaluated in the order given, and the results of earlier expressions are available to later expressions.

node|eval(expressions ...tick.Node)

Returns: EvalNode
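
A small sketch of a single transformation follows. It assumes the EvalNode exposes an as property for naming the result, which is not documented on this page, and that the query result field is named "mean". The output name 'cpu_usage' is hypothetical.

    // Derive a usage percentage from the mean idle percentage.
    batch
        |query('''
            SELECT mean("value")
            FROM "telegraf"."default".cpu_usage_idle
        ''')
            .period(1m)
            .every(20s)
        |eval(lambda: 100.0 - "mean")
            .as('cpu_usage')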

First

Select the first point.

node|first(field string)

Returns: InfluxQLNode

HttpOut

Create an http output node that caches the most recent data it has received. The cached data is available at the given endpoint. The endpoint is the relative path from the API endpoint of the running task. For example, if the task endpoint is at "/api/v1/task/<task_name>" and endpoint is "top10", then the data can be requested from "/api/v1/task/<task_name>/top10".

node|httpOut(endpoint string)

Returns: HTTPOutNode
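
The "top10" endpoint mentioned above might be produced by a task along these lines. This is only a sketch: the query and the use of top to pick ten points are illustrative choices, not something this page prescribes.

    // Cache the ten highest "value" points from the last minute.
    // The result would be served at /api/v1/task/<task_name>/top10.
    batch
        |query('''
            SELECT "value"
            FROM "telegraf"."default".cpu_usage_idle
        ''')
            .period(1m)
            .every(20s)
        |top(10, 'value')
        |httpOut('top10')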

InfluxDBOut

Create an influxdb output node that will store the incoming data into InfluxDB.

node|influxDBOut()

Returns: InfluxDBOutNode
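
A sketch of writing an aggregate back to InfluxDB is shown below. The database, retentionPolicy and measurement properties are assumed to exist on the InfluxDBOutNode; they are not documented on this page, and the target measurement name 'mean_cpu_idle' is hypothetical.

    // Store the one minute mean in a separate measurement.
    batch
        |query('''
            SELECT mean("value")
            FROM "telegraf"."default".cpu_usage_idle
        ''')
            .period(1m)
            .every(1m)
        |influxDBOut()
            .database('telegraf')
            .retentionPolicy('default')
            .measurement('mean_cpu_idle')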

Join

Join this node with other nodes. The data is joined on timestamp.

node|join(others ...Node)

Returns: JoinNode
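
The sketch below joins two batch queries that share the same schedule, so that their timestamps have a chance of lining up. It assumes the JoinNode provides an as property that prefixes the fields of each parent; that property is not documented on this page. The cpu_usage_user measurement is hypothetical.

    // Two queries on the same aligned schedule.
    var idle = batch
        |query('SELECT mean("value") FROM "telegraf"."default".cpu_usage_idle')
            .period(1m)
            .every(1m)
            .align()
            .groupBy(time(10s))

    var user = batch
        |query('SELECT mean("value") FROM "telegraf"."default".cpu_usage_user')
            .period(1m)
            .every(1m)
            .align()
            .groupBy(time(10s))

    // Join on timestamp; fields are then expected as "idle.mean" and "user.mean".
    idle
        |join(user)
            .as('idle', 'user')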

Last

Select the last point.

node|last(field string)

Returns: InfluxQLNode

Log

Create a node that logs all data it receives.

node|log()

Returns: LogNode

Max

Select the maximum point.

node|max(field string)

Returns: InfluxQLNode

Mean

Compute the mean of the data.

node|mean(field string)

Returns: InfluxQLNode

Median

Compute the median of the data. Note that this method is not a selector; if you want the median point, use |percentile(field, 50.0).

node|median(field string)

Returns: InfluxQLNode

Min

Select the minimum point.

node|min(field string)

Returns: InfluxQLNode

Percentile

Select a point at the given percentile. This is a selector function; no interpolation between points is performed.

node|percentile(field string, percentile float64)

Returns: InfluxQLNode
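
For instance, a task selecting the 95th percentile point of each batch might look like the sketch below. The query and the percentile value are illustrative; passing 50.0 instead would select the median point.

    // Select the point at the 95th percentile of "value" in each batch.
    batch
        |query('''
            SELECT "value"
            FROM "telegraf"."default".cpu_usage_idle
        ''')
            .period(1m)
            .every(20s)
        |percentile('value', 95.0)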

Sample

Create a new node that samples the incoming points or batches.

One point will be emitted every count or duration specified.

node|sample(rate interface{})

Returns: SampleNode

Shift

Create a new node that shifts the incoming points or batches in time.

node|shift(shift time.Duration)

Returns: ShiftNode
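
A brief sketch follows. It assumes that a positive duration moves timestamps forward in time and a negative duration moves them backward, which this page does not state explicitly.

    // Shift every point in each batch 5 minutes forward in time.
    batch
        |query('''
            SELECT mean("value")
            FROM "telegraf"."default".cpu_usage_idle
        ''')
            .period(1m)
            .every(1m)
        |shift(5m)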

Spread

Compute the difference between min and max points.

node|spread(field string)

Returns: InfluxQLNode

Stats

Create a new stream of data that contains the internal statistics of the node. The interval represents how often to emit the statistics based on real time. This means the interval time is independent of the times of the data points the source node is receiving.

node|stats(interval time.Duration)

Returns: StatsNode

Stddev

Compute the standard deviation.

node|stddev(field string)

Returns: InfluxQLNode

Sum

Compute the sum of all values.

node|sum(field string)

Returns: InfluxQLNode

Top

Select the top num points for field and sort by any extra tags or fields.

node|top(num int64, field string, fieldsAndTags ...string)

Returns: InfluxQLNode

Union

Perform the union of this node and all other given nodes.

node|union(node ...Node)

Returns: UnionNode
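
As a sketch, two batch queries can be combined into a single flow of batches as shown below. The host values are hypothetical; 'serverA' comes from the example at the top of this page and 'serverB' is added for illustration.

    var a = batch
        |query('''
            SELECT mean("value")
            FROM "telegraf"."default".cpu_usage_idle
            WHERE "host" = 'serverA'
        ''')
            .period(1m)
            .every(20s)

    var b = batch
        |query('''
            SELECT mean("value")
            FROM "telegraf"."default".cpu_usage_idle
            WHERE "host" = 'serverB'
        ''')
            .period(1m)
            .every(20s)

    // Downstream nodes receive the batches from both queries.
    a
        |union(b)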

Where

Create a new node that filters the data stream by a given expression.

node|where(expression tick.Node)

Returns: WhereNode
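
A minimal sketch of filtering queried points is shown below. The threshold is an arbitrary illustrative value, and the lambda syntax follows the Deadman examples above.

    // Keep only the points where idle CPU is below 10.
    batch
        |query('''
            SELECT "value"
            FROM "telegraf"."default".cpu_usage_idle
        ''')
            .period(1m)
            .every(20s)
        |where(lambda: "value" < 10.0)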

Window

Create a new node that windows the stream by time.

NOTE: Window can only be applied to stream edges.

node|window()

Returns: WindowNode