A BatchNode defines a source and a schedule for processing batch data. The data is queried from an InfluxDB database and then passed into the data pipeline.
batch
    .query('''
        SELECT mean("value")
        FROM "telegraf"."default".cpu_usage_idle
        WHERE "host" = 'serverA'
    ''')
    .period(1m)
    .every(20s)
    .groupBy(time(10s), 'cpu')
    ...
In the above example, InfluxDB is queried every 20 seconds; each query returns a 1-minute window of data, grouped into 10-second buckets.
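The timing in this example can be checked with a small sketch. The Python below is not Kapacitor code; it only models the schedule described above, assuming each query covers the period that ends at the moment the query fires. The function name `query_windows` and the start date are hypothetical.

```python
from datetime import datetime, timedelta

def query_windows(first_run, every, period, runs):
    """Return the (start, stop) window covered by each scheduled query."""
    windows = []
    for i in range(runs):
        stop = first_run + i * every   # the moment the query fires (assumed)
        start = stop - period          # look back one full period
        windows.append((start, stop))
    return windows

# Values from the example: every(20s), period(1m), time(10s) buckets.
first = datetime(2016, 1, 1, 12, 0, 0)  # hypothetical start time
windows = query_windows(first,
                        every=timedelta(seconds=20),
                        period=timedelta(minutes=1),
                        runs=3)

# Number of 10-second buckets in each 1-minute window.
buckets_per_window = timedelta(minutes=1) // timedelta(seconds=10)
```

Under this assumption, consecutive windows overlap by 40 seconds because the period (1m) is longer than the interval (20s), and each window holds six 10-second buckets.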
Property methods modify state on the calling node. They do not add another node to the pipeline, and always return a reference to the calling node.
Define a schedule using a cron syntax.
The specific cron implementation is documented here: https://github.com/gorhill/cronexpr#implementation
The Cron property is mutually exclusive with the Every property.
How often to query InfluxDB.
The Every property is mutually exclusive with the Cron property.
Fill the data. Options are:
- Any numerical value
- null - exhibits the same behavior as the default
- previous - reports the value of the previous window
- none - suppresses timestamps and values where the value is null
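The effect of each option can be illustrated with a minimal Python sketch. It models only the behavior listed above and is not how Kapacitor or InfluxDB implement it; the function name `apply_fill` and the sample points are hypothetical, and treating `previous` with no earlier value as null is an assumption.

```python
def apply_fill(points, fill="null"):
    """Fill empty windows in a list of (timestamp, value_or_None) pairs.

    fill may be a number, 'null', 'previous', or 'none'.
    """
    is_number = isinstance(fill, (int, float)) and not isinstance(fill, bool)
    if fill not in ("null", "previous", "none") and not is_number:
        raise ValueError("unsupported fill option: %r" % (fill,))

    result = []
    previous = None  # assumption: 'previous' yields null until a value is seen
    for timestamp, value in points:
        if value is not None:
            result.append((timestamp, value))
            previous = value
        elif fill == "none":
            continue                              # drop the empty window
        elif fill == "previous":
            result.append((timestamp, previous))  # repeat the last value
        elif fill == "null":
            result.append((timestamp, None))      # default: keep the null
        else:
            result.append((timestamp, fill))      # numerical fill value
    return result

# Three 10-second windows; the middle one has no data.
points = [(0, 1.0), (10, None), (20, 3.0)]
filled_zero = apply_fill(points, 0)
filled_previous = apply_fill(points, "previous")
filled_none = apply_fill(points, "none")
filled_null = apply_fill(points)
```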
Group the data by a set of dimensions. Can specify one time dimension.
This property adds a GROUP BY clause to the query, so all the normal behaviors when querying InfluxDB with a GROUP BY apply.
More details: https://influxdb.com/docs/v0.9/query_language/data_exploration.html#the-group-by-clause
batch
    .groupBy(time(10s), 'tag1', 'tag2')
How far back in time to query from the current time.
For example, with an Offset of 2 hours and an Every of 5m, Kapacitor will query InfluxDB every 5 minutes for the window of data from 2 hours ago.
This applies to Cron schedules as well. If the cron schedule specifies a run every Sunday at 1 AM and the Offset is 1 hour, then at 1 AM on Sunday the data from 12 AM will be queried.
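The two Offset examples can be worked through in a short Python sketch. It is an illustration only and rests on an assumption that the text does not state outright: the queried window ends at the run time minus the Offset and reaches back one Period from there. The function name `offset_window` and the Period values are hypothetical.

```python
from datetime import datetime, timedelta

def offset_window(run_time, offset, period):
    """Return the (start, stop) window queried at run_time."""
    stop = run_time - offset   # assumed: the window ends Offset before the run
    start = stop - period      # and spans one Period
    return start, stop

# Every of 5m with an Offset of 2 hours; a Period of 5m is assumed here.
first_run = datetime(2016, 1, 1, 12, 0)
every_example = [
    offset_window(first_run + i * timedelta(minutes=5),
                  offset=timedelta(hours=2),
                  period=timedelta(minutes=5))
    for i in range(2)
]

# Cron run on a Sunday at 1 AM with an Offset of 1 hour.
# A Period of 30 minutes is assumed for illustration.
sunday_1am = datetime(2016, 1, 3, 1, 0)  # 2016-01-03 was a Sunday
cron_example = offset_window(sunday_1am,
                             offset=timedelta(hours=1),
                             period=timedelta(minutes=30))
```

With these assumed values, the Sunday run queries a window that ends at 12 AM, one hour before the run itself.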
The period or length of time that will be queried from InfluxDB.
Chaining methods create a new node in the pipeline as a child of the calling node. They do not modify the calling node.
Create an alert node, which can trigger alerts.
Create a new node that computes the derivative of adjacent points.
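As a rough illustration of a derivative over adjacent points, the Python sketch below divides the change in value by the elapsed time between each pair of neighbors. It is not Kapacitor's implementation; the function name `derivative`, the per-second default unit, and the choice to stamp each result with the later point's time are all assumptions.

```python
def derivative(points, unit=1.0):
    """Rate of change between adjacent (time_in_seconds, value) points.

    unit is the number of seconds the rate is expressed per (assumed).
    """
    result = []
    for (t0, v0), (t1, v1) in zip(points, points[1:]):
        elapsed = t1 - t0
        if elapsed <= 0:
            continue  # skip duplicate or out-of-order timestamps
        result.append((t1, (v1 - v0) / (elapsed / unit)))
    return result

# Hypothetical points 10 seconds apart.
points = [(0, 10.0), (10, 30.0), (20, 20.0)]
rates = derivative(points)
```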
Create an eval node that will apply the given transformation function to each data point. A list of expressions may be provided; they are evaluated in the order given, and the results of earlier expressions are available to later expressions.
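The ordering rule can be sketched in Python. This models only the evaluation order described here, not the eval node itself; the function name `eval_point`, the field names, and the use of Python lambdas in place of TICKscript expressions are hypothetical.

```python
def eval_point(point, expressions):
    """Evaluate named expressions in order against one data point.

    Each result is stored under its name, so later expressions can read it.
    """
    fields = dict(point)  # copy, so the incoming point is left unchanged
    for name, expression in expressions:
        fields[name] = expression(fields)
    return fields

point = {"errors": 5, "total": 20}
expressions = [
    ("error_rate", lambda f: f["errors"] / f["total"]),
    # Uses the result of the previous expression.
    ("error_percent", lambda f: f["error_rate"] * 100),
]
result = eval_point(point, expressions)
```

Reversing the two expressions would fail in this sketch, because `error_percent` depends on a result that has not been computed yet.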
Create an http output node that caches the most recent data it has received. The cached data is available at the given endpoint. The endpoint is the relative path from the API endpoint of the running task. For example if the task endpoint is at "/api/v1/task/<task_name>" and endpoint is "top10", then the data can be requested from "/api/v1/task/<task_name>/top10".
Create an influxdb output node that will store the incoming data into InfluxDB.
Join this node with other nodes. The data is joined on timestamp.
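A minimal Python sketch of joining two series on timestamp follows. It assumes that only timestamps present in both inputs produce a joined point, which the text does not specify; the function name `join_on_time`, the series names, and the data are hypothetical.

```python
def join_on_time(left, right, names=("left", "right")):
    """Join two lists of (timestamp, value) pairs on matching timestamps."""
    right_by_time = dict(right)
    joined = []
    for timestamp, value in left:
        if timestamp in right_by_time:  # assumed: unmatched points are dropped
            joined.append((timestamp, {names[0]: value,
                                       names[1]: right_by_time[timestamp]}))
    return joined

errors = [(0, 2), (10, 5)]
requests = [(0, 100), (10, 250), (20, 300)]
joined = join_on_time(errors, requests, names=("errors", "requests"))
```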
Perform a map-reduce operation on the data.
The built-in functions under influxql provide the selection, aggregation, and transformation functions from the InfluxQL language.
MapReduce may be applied to either a batch or a stream edge. In the case of a batch, each batch is passed to the mapper independently. In the case of a stream, all incoming data points that have the exact same time are combined into a batch and sent to the mapper.
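The stream case can be pictured with a short Python sketch that combines points sharing a timestamp into batches and then reduces each batch. It is an illustration rather than Kapacitor's code: it assumes points arrive in time order, and the names `batches_by_time` and `mean` are hypothetical stand-ins for the grouping step and an influxql function.

```python
from itertools import groupby
from operator import itemgetter

def batches_by_time(points):
    """Combine (timestamp, value) points with the exact same time into batches."""
    return [(timestamp, [value for _, value in group])
            for timestamp, group in groupby(points, key=itemgetter(0))]

def mean(values):
    """Stand-in for an aggregation function applied to one batch."""
    return sum(values) / len(values)

# Two points share time 0, so they form a single batch.
points = [(0, 1.0), (0, 3.0), (10, 5.0)]
batches = batches_by_time(points)
means = [(timestamp, mean(values)) for timestamp, values in batches]
```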
Create a new node that samples the incoming points or batches.
One point will be emitted every count or duration specified.
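Count-based sampling can be sketched in a few lines of Python. The sketch assumes the first point of every run of `count` points is the one emitted; the text does not say which point is kept, and the function name `sample_every` is hypothetical.

```python
def sample_every(points, count):
    """Emit one point out of every `count` incoming points."""
    if count < 1:
        raise ValueError("count must be at least 1")
    return points[::count]  # assumed: keep the first point of each run

points = list(range(10))
sampled = sample_every(points, 3)
```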
Perform the union of this node and all other given nodes.
Create a new node that filters the data stream by a given expression.
Create a new node that windows the stream by time.
NOTE: Window can only be applied to stream edges.