StreamNode

Warning! This page documents an old version of Kapacitor, which is no longer actively developed. Kapacitor v1.2 is the most recent stable version of Kapacitor.

A StreamNode represents the source of data being streamed to Kapacitor via any of its inputs. The stream node allows you to select which portion of the stream you want to process. The stream variable in stream tasks is an instance of a StreamNode.

Example:

    stream
        .from()
           .database('mydb')
           .retentionPolicy('myrp')
           .measurement('mymeasurement')
           .where(lambda: "host" =~ /logger\d+/)
        .window()
        ...

The above example selects only data points from the database mydb, retention policy myrp, and measurement mymeasurement where the tag host matches the regex logger\d+.

Properties

Property methods modify state on the calling node. They do not add another node to the pipeline, and always return a reference to the calling node.

Database

The database name. If empty, any database will be used.

node.database(value string)

Measurement

The measurement name. If empty, any measurement will be used.

node.measurement(value string)

RetentionPolicy

The retention policy name. If empty, any retention policy will be used.

node.retentionPolicy(value string)

Truncate

Optional duration for truncating timestamps. Helpful to ensure data points land on specific boundaries. Example:

    stream
       .from().measurement('mydata')
           .truncate(1s)

All incoming data will be truncated to 1 second resolution.

node.truncate(value time.Duration)

Where

Filter the current stream using the given expression. This expression is a Kapacitor expression. Kapacitor expressions are a superset of InfluxQL WHERE expressions. See the expression docs for more information.

Multiple calls to the Where method combine their expressions with AND.

Example:

    stream
       .from()
          .where(lambda: condition1)
          .where(lambda: condition2)

The above is equivalent to this example:

    stream
       .from()
          .where(lambda: condition1 AND condition2)

NOTE: Be careful to always use .from if you want multiple different streams.

Example:

    var data = stream.from().measurement('cpu')
    var total = data.where(lambda: "cpu" == 'cpu-total')
    var others = data.where(lambda: "cpu" != 'cpu-total')

Because where is a property method, both calls modify the same node. The example above is therefore equivalent to the example below, which is not what was intended.

Example:

    var data = stream
        .from()
            .measurement('cpu')
            .where(lambda: "cpu" == 'cpu-total' AND "cpu" != 'cpu-total')
    var total = data
    var others = total

The example below creates two different streams, each selecting a different subset of the original stream.

Example:

    var data = stream.from().measurement('cpu')
    var total = stream.from().measurement('cpu').where(lambda: "cpu" == 'cpu-total')
    var others = stream.from().measurement('cpu').where(lambda: "cpu" != 'cpu-total')

If the expression is empty, all data points are considered to match.

node.where(expression tick.Node)

Chaining Methods

Chaining methods create a new node in the pipeline as a child of the calling node. They do not modify the calling node.

Alert

Create an alert node, which can trigger alerts.

node.alert()

Returns: AlertNode
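
A minimal sketch of a typical use, assuming a hypothetical cpu measurement with a float field usage_idle. The thresholds are illustrative only; see the AlertNode documentation for the full set of properties. Example:

    // Warn when idle CPU drops below 30%, go critical below 10%.
    stream
        .from()
            .measurement('cpu')
        .alert()
            .warn(lambda: "usage_idle" < 30.0)
            .crit(lambda: "usage_idle" < 10.0)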

Deadman

Helper function for creating an alert on low throughput, aka deadman's switch.

  • Threshold – trigger alert if throughput drops below threshold in points/interval.
  • Interval – how often to check the throughput.
  • Expressions – optional list of expressions to also evaluate. Useful for time of day alerting.

Example:

    var data = stream.from()...
    // Trigger critical alert if the throughput drops below 100 points per 10s and checked every 10s.
    data.deadman(100.0, 10s)
    //Do normal processing of data
    data....

The above is equivalent to this example:

    var data = stream.from()...
    // Trigger critical alert if the throughput drops below 100 points per 10s and checked every 10s.
    data.stats(10s)
          .derivative('collected')
              .unit(10s)
              .nonNegative()
          .alert()
              .id('node \'stream0\' in task \'{{ .TaskName }}\'')
              .message('{{ .ID }} is {{ if eq .Level "OK" }}alive{{ else }}dead{{ end }}: {{ index .Fields "collected" | printf "%0.3f" }} points/10s.')
              .crit(lambda: "collected" <= 100.0)
    //Do normal processing of data
    data....

The id and message alert properties can be configured globally via the 'deadman' configuration section.

Since the AlertNode is the last node in the chain, it can be further modified as normal. Example:

    var data = stream.from()...
    // Trigger critical alert if the throughput drops below 100 points per 10s and checked every 10s.
    data.deadman(100.0, 10s).slack().channel('#dead_tasks')
    //Do normal processing of data
    data....

You can specify additional lambda expressions to further constrain when the deadman's switch is triggered. Example:

    var data = stream.from()...
    // Trigger critical alert if the throughput drops below 100 points per 10s and checked every 10s.
    // Only trigger the alert if the time of day is between 8am-5pm.
    data.deadman(100.0, 10s, lambda: hour("time") >= 8 AND hour("time") <= 17)
    //Do normal processing of data
    data....

node.deadman(threshold float64, interval time.Duration, expr ...tick.Node)

Returns: AlertNode

Derivative

Create a new node that computes the derivative of adjacent points.

node.derivative(field string)

Returns: DerivativeNode
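
As a sketch, the following computes a per-second rate of change. The net measurement and bytes_sent field are hypothetical names; unit and nonNegative are the same DerivativeNode properties used in the deadman example above. Example:

    // Per-second rate of change of a counter field.
    // nonNegative drops negative results, such as those caused by a counter reset.
    stream
        .from()
            .measurement('net')
        .derivative('bytes_sent')
            .unit(1s)
            .nonNegative()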

Eval

Create an eval node that applies the given transformation function to each data point. A list of expressions may be provided. They are evaluated in the order given, and the results of earlier expressions are available to later expressions.

node.eval(expressions ...tick.Node)

Returns: EvalNode
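
A minimal sketch of chained expressions, assuming a hypothetical mem measurement whose used and total fields are both floats. The second expression reads the ratio result produced by the first. Example:

    // 'ratio' is computed first, then reused to compute 'percent'.
    stream
        .from()
            .measurement('mem')
        .eval(lambda: "used" / "total", lambda: "ratio" * 100.0)
            .as('ratio', 'percent')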

From

Creates a new stream node that can be further filtered using the Database, RetentionPolicy, Measurement and Where properties. From can be called multiple times to create multiple independent forks of the data stream.

Example:

    // Select the 'cpu' measurement from just the database 'mydb'
    // and retention policy 'myrp'.
    var cpu = stream.from()
                       .database('mydb')
                       .retentionPolicy('myrp')
                       .measurement('cpu')
    // Select the 'load' measurement from any database and retention policy.
    var load = stream.from()
                        .measurement('load')
    // Join cpu and load streams and do further processing.
    cpu.join(load)
            .as('cpu', 'load')
        ...

node.from()

Returns: StreamNode

GroupBy

Group the data by a set of tags.

Can pass literal * to group by all dimensions. Example:

    .groupBy(*)

node.groupBy(tag ...interface{})

Returns: StreamNode
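
To group by specific tags instead, pass their names. A sketch, assuming a hypothetical cpu measurement tagged with host and cpu:

    // Downstream nodes process each (host, cpu) combination independently.
    stream
        .from()
            .measurement('cpu')
        .groupBy('host', 'cpu')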

HttpOut

Create an http output node that caches the most recent data it has received. The cached data is available at the given endpoint. The endpoint is the relative path from the API endpoint of the running task. For example if the task endpoint is at "/api/v1/task/<task_name>" and endpoint is "top10", then the data can be requested from "/api/v1/task/<task_name>/top10".

node.httpOut(endpoint string)

Returns: HTTPOutNode
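
A sketch of exposing the most recent window of data, using a hypothetical cpu measurement and a hypothetical endpoint name cpu_recent. Following the path scheme described above, a task named cpu_task would serve the data at "/api/v1/task/cpu_task/cpu_recent". Example:

    // Cache the latest 10s window and serve it over HTTP.
    stream
        .from()
            .measurement('cpu')
        .window()
            .period(10s)
            .every(10s)
        .httpOut('cpu_recent')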

InfluxDBOut

Create an influxdb output node that will store the incoming data into InfluxDB.

node.influxDBOut()

Returns: InfluxDBOutNode
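
A sketch of writing derived data back to InfluxDB. The measurement and field names are hypothetical; database, retentionPolicy and measurement here are properties of the InfluxDBOutNode that select the write target. Example:

    // Store a per-second rate in the measurement 'net_rate'.
    stream
        .from()
            .measurement('net')
        .derivative('bytes_sent')
            .unit(1s)
        .influxDBOut()
            .database('mydb')
            .retentionPolicy('myrp')
            .measurement('net_rate')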

Join

Join this node with other nodes. The data is joined on timestamp.

node.join(others ...Node)

Returns: JoinNode

MapReduce

Perform a map-reduce operation on the data. The built-in functions under influxql provide the selection, aggregation, and transformation functions from the InfluxQL language.

MapReduce may be applied to either a batch or a stream edge. In the case of a batch each batch is passed to the mapper independently. In the case of a stream all incoming data points that have the exact same time are combined into a batch and sent to the mapper.

node.mapReduce(mr MapReduceInfo)

Returns: ReduceNode
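
On a stream, MapReduce is usually applied after a window so that the mapper receives a meaningful batch. A sketch, assuming a hypothetical cpu measurement with a field usage_idle:

    // Compute the mean of 'usage_idle' over each 10s window.
    stream
        .from()
            .measurement('cpu')
        .window()
            .period(10s)
            .every(10s)
        .mapReduce(influxql.mean('usage_idle'))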

Sample

Create a new node that samples the incoming points or batches.

One point will be emitted every count or duration specified.

node.sample(rate interface{})

Returns: SampleNode
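
The rate may be either an integer count or a duration. A sketch of both forms, using a hypothetical cpu measurement:

    var data = stream.from().measurement('cpu')
    // Emit one point for every 10 points received.
    data.sample(10)
    // Emit one point every 10 seconds.
    data.sample(10s)

Because sample is a chaining method, each call creates a separate child node and leaves data unchanged.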

Stats

Create a new stream of data that contains the internal statistics of the node. The interval represents how often to emit the statistics based on real time. This means the interval time is independent of the times of the data points the source node is receiving.

node.stats(interval time.Duration)

Returns: StatsNode

Union

Perform the union of this node and all other given nodes.

node.union(node ...Node)

Returns: UnionNode
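
A minimal sketch, assuming hypothetical cpu and load measurements. Each input is created with its own call to from so that the two streams are independent. Example:

    var cpu = stream.from().measurement('cpu')
    var load = stream.from().measurement('load')
    // Combine both streams into a single stream for further processing.
    cpu.union(load)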

Window

Create a new node that windows the stream by time.

NOTE: Window can only be applied to stream edges.

node.window()

Returns: WindowNode
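
A sketch of a sliding window, using a hypothetical cpu measurement; period and every are WindowNode properties. With these values, each emitted batch covers the last 10 seconds of data and a new batch is emitted every 5 seconds, so consecutive batches overlap. Example:

    stream
        .from()
            .measurement('cpu')
        .window()
            .period(10s)
            .every(5s)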