---
build:
  list: never
  publishResources: false
  render: never
sitemap:
  disable: true
---
[vmagent](https://docs.victoriametrics.com/victoriametrics/vmagent/) and [single-node VictoriaMetrics](https://docs.victoriametrics.com/victoriametrics/single-server-victoriametrics/) can aggregate incoming [samples](https://docs.victoriametrics.com/victoriametrics/keyconcepts/#raw-samples) in streaming mode **by time** and **by labels** before data is written to remote storage (or local storage for single-node VictoriaMetrics). The aggregation is applied to all the metrics received via any [supported data ingestion protocol](https://docs.victoriametrics.com/victoriametrics/single-server-victoriametrics/#how-to-import-time-series-data) and/or scraped from [Prometheus-compatible targets](https://docs.victoriametrics.com/victoriametrics/single-server-victoriametrics/#how-to-scrape-prometheus-exporters-such-as-node-exporter), and allows building [flexible processing pipelines](#routing).

> By default, stream aggregation ignores timestamps associated with the input [samples](https://docs.victoriametrics.com/victoriametrics/keyconcepts/#raw-samples). It expects that the ingested samples have timestamps close to the current time. See [how to ignore old samples](#ignoring-old-samples).

> If `-streamAggr.dedupInterval` is enabled, out-of-order samples (older than already received) within the configured interval are treated as duplicates and ignored. See [de-duplication](#deduplication).

# Use cases

Stream aggregation can be used in the following cases:

* [Statsd alternative](#statsd-alternative)
* [Recording rules alternative](#recording-rules-alternative)
* [Reducing the number of stored samples](#reducing-the-number-of-stored-samples)
* [Reducing the number of stored series](#reducing-the-number-of-stored-series)

## Statsd alternative

Stream aggregation can be used as a [statsd](https://github.com/statsd/statsd) alternative in the following cases:

* [Counting input samples](#counting-input-samples)
* [Summing input metrics](#summing-input-metrics)
* [Quantiles over input metrics](#quantiles-over-input-metrics)
* [Histograms over input metrics](#histograms-over-input-metrics)
* [Aggregating histograms](#aggregating-histograms)

Currently, streaming aggregation is available only for [supported data ingestion protocols](https://docs.victoriametrics.com/victoriametrics/single-server-victoriametrics/#how-to-import-time-series-data) and is not available for the [Statsd metrics format](https://github.com/statsd/statsd/blob/master/docs/metric_types.md).

## Recording rules alternative

Sometimes [alerting queries](https://docs.victoriametrics.com/victoriametrics/vmalert/#alerting-rules) may require non-trivial amounts of CPU, RAM, disk IO and network bandwidth at the metrics storage side. For example, if the `http_request_duration_seconds` histogram is generated by thousands of application instances, then the alerting query `histogram_quantile(0.99, sum(increase(http_request_duration_seconds_bucket[5m])) without (instance)) > 0.5` can become slow, since it needs to scan a very large number of unique [time series](https://docs.victoriametrics.com/victoriametrics/keyconcepts/#time-series) with the `http_request_duration_seconds_bucket` name. This alerting query can be accelerated by pre-calculating the `sum(increase(http_request_duration_seconds_bucket[5m])) without (instance)` via a [recording rule](https://docs.victoriametrics.com/victoriametrics/vmalert/#recording-rules).
But this recording rule may also take too long to execute. In this case the slow recording rule can be substituted with the following [stream aggregation config](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#stream-aggregation-config):

```yaml
- match: 'http_request_duration_seconds_bucket'
  interval: 5m
  without: [instance]
  outputs: [total]
```

This stream aggregation generates `http_request_duration_seconds_bucket:5m_without_instance_total` output series according to [output metric naming](#output-metric-names). Then these series can be used in [alerting rules](https://docs.victoriametrics.com/victoriametrics/vmalert/#alerting-rules):

```metricsql
histogram_quantile(0.99, last_over_time(http_request_duration_seconds_bucket:5m_without_instance_total[5m])) > 0.5
```

This query is executed much faster than the original query, because it needs to scan far fewer time series.

See [the list of aggregate outputs](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#aggregation-outputs), which can be specified in the `outputs` field. See also [aggregating by labels](#aggregating-by-labels).

It is recommended to set `interval` to a value at least several times higher than your metrics collection interval.

## Reducing the number of stored samples

If per-[series](https://docs.victoriametrics.com/victoriametrics/keyconcepts/#time-series) samples are ingested at high frequency, then this may result in high disk space usage, since too much data must be stored to disk. This also may result in slow queries, since too much data must be processed during queries.

This can be fixed with stream aggregation by increasing the interval between per-series samples stored in the database.
For example, the following [stream aggregation config](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#stream-aggregation-config) reduces the frequency of input samples to one sample per 5 minutes per input time series (this operation is also known as downsampling):

```yaml
# Aggregate metrics ending with _total with `total` output.
# See https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#aggregation-outputs
- match: '{__name__=~".+_total"}'
  interval: 5m
  outputs: [total]

# Downsample other metrics with `count_samples`, `sum_samples`, `min` and `max` outputs
# See https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#aggregation-outputs
- match: '{__name__!~".+_total"}'
  interval: 5m
  outputs: [count_samples, sum_samples, min, max]
```

The aggregated output metrics have the following names according to [output metric naming](#output-metric-names):

```text
# For input metrics ending with _total
some_metric_total:5m_total

# For input metrics not ending with _total
some_metric:5m_count_samples
some_metric:5m_sum_samples
some_metric:5m_min
some_metric:5m_max
```

See [the list of aggregate outputs](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#aggregation-outputs), which can be specified in the `outputs` field. See also [aggregating histograms](#aggregating-histograms) and [aggregating by labels](#aggregating-by-labels).

## Reducing the number of stored series

Sometimes applications may generate too many [time series](https://docs.victoriametrics.com/victoriametrics/keyconcepts/#time-series). For example, the `http_requests_total` metric may have a `path` or `user` label with a very large number of unique values.
In this case the following stream aggregation can be used for reducing the number of metrics stored in VictoriaMetrics:

```yaml
- match: 'http_requests_total'
  interval: 30s
  without: [path, user]
  outputs: [total]
```

This config lists the labels to remove from the aggregate output in the `without` list. See [these docs](#aggregating-by-labels) for more details.

The aggregated output metric has the following name according to [output metric naming](#output-metric-names):

```text
http_requests_total:30s_without_path_user_total
```

See [the list of aggregate outputs](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#aggregation-outputs), which can be specified in the `outputs` field. See also [aggregating histograms](#aggregating-histograms).

## Counting input samples

If the monitored application generates event-based metrics, then it may be useful to count the number of such metrics at the stream aggregation level.

For example, if an advertising server generates `hits{some="labels"} 1` and `clicks{some="labels"} 1` metrics for each incoming hit and click, then the following [stream aggregation config](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#stream-aggregation-config) can be used for counting these metrics over every 30-second interval:

```yaml
- match: '{__name__=~"hits|clicks"}'
  interval: 30s
  outputs: [count_samples]
```

This config generates the following output metrics for `hits` and `clicks` input metrics according to [output metric naming](#output-metric-names):

```text
hits:30s_count_samples count1
clicks:30s_count_samples count2
```

See [the list of aggregate outputs](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#aggregation-outputs), which can be specified in the `outputs` field. See also [aggregating by labels](#aggregating-by-labels).

## Summing input metrics

If the monitored application counts some events and then sends the resulting counts to VictoriaMetrics at irregular intervals or at too high a frequency, then stream aggregation can be used for summing such events and writing the aggregate sums to the storage at regular intervals.

For example, if an advertising server generates `hits{some="labels"} N` and `clicks{some="labels"} M` metrics at irregular intervals, then the following [stream aggregation config](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#stream-aggregation-config) can be used for summing these metrics every minute:

```yaml
- match: '{__name__=~"hits|clicks"}'
  interval: 1m
  outputs: [sum_samples]
```

This config generates the following output metrics according to [output metric naming](#output-metric-names):

```text
hits:1m_sum_samples sum1
clicks:1m_sum_samples sum2
```

See [the list of aggregate outputs](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#aggregation-outputs), which can be specified in the `outputs` field. See also [aggregating by labels](#aggregating-by-labels).

## Quantiles over input metrics

If the monitored application generates measurement metrics for each request, then it may be useful to calculate a pre-defined set of [percentiles](https://en.wikipedia.org/wiki/Percentile) over these measurements.
For example, if the monitored application generates `request_duration_seconds N` and `response_size_bytes M` metrics for each incoming request, then the following [stream aggregation config](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#stream-aggregation-config) can be used for calculating the 50th and 99th percentiles for these metrics every 30 seconds:

```yaml
- match:
  - request_duration_seconds
  - response_size_bytes
  interval: 30s
  outputs: ["quantiles(0.50, 0.99)"]
```

This config generates the following output metrics according to [output metric naming](#output-metric-names):

```text
request_duration_seconds:30s_quantiles{quantile="0.50"} value1
request_duration_seconds:30s_quantiles{quantile="0.99"} value2

response_size_bytes:30s_quantiles{quantile="0.50"} value1
response_size_bytes:30s_quantiles{quantile="0.99"} value2
```

See [the list of aggregate outputs](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#aggregation-outputs), which can be specified in the `outputs` field. See also [histograms over input metrics](#histograms-over-input-metrics) and [aggregating by labels](#aggregating-by-labels).

## Histograms over input metrics

If the monitored application generates measurement metrics for each request, then it may be useful to calculate a [histogram](https://docs.victoriametrics.com/victoriametrics/keyconcepts/#histogram) over these metrics.
For example, if the monitored application generates `request_duration_seconds N` and `response_size_bytes M` metrics for each incoming request, then the following [stream aggregation config](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#stream-aggregation-config) can be used for calculating [VictoriaMetrics histogram buckets](https://valyala.medium.com/improving-histogram-usability-for-prometheus-and-grafana-bc7e5df0e350) for these metrics every 60 seconds:

```yaml
- match:
  - request_duration_seconds
  - response_size_bytes
  interval: 60s
  outputs: [histogram_bucket]
```

This config generates the following output metrics according to [output metric naming](#output-metric-names):

```text
request_duration_seconds:60s_histogram_bucket{vmrange="start1...end1"} count1
request_duration_seconds:60s_histogram_bucket{vmrange="start2...end2"} count2
...
request_duration_seconds:60s_histogram_bucket{vmrange="startN...endN"} countN

response_size_bytes:60s_histogram_bucket{vmrange="start1...end1"} count1
response_size_bytes:60s_histogram_bucket{vmrange="start2...end2"} count2
...
response_size_bytes:60s_histogram_bucket{vmrange="startN...endN"} countN
```

The resulting histogram buckets can be queried with [MetricsQL](https://docs.victoriametrics.com/victoriametrics/metricsql/) in the following ways:

1. Estimated 50th and 99th [percentiles](https://en.wikipedia.org/wiki/Percentile) of the request duration over the last hour:

   ```metricsql
   histogram_quantiles("quantile", 0.50, 0.99, sum(increase(request_duration_seconds:60s_histogram_bucket[1h])) by (vmrange))
   ```

   This query uses the [histogram_quantiles](https://docs.victoriametrics.com/victoriametrics/metricsql/#histogram_quantiles) function.

1. An estimated [standard deviation](https://en.wikipedia.org/wiki/Standard_deviation) of the request duration over the last hour:

   ```metricsql
   histogram_stddev(sum(increase(request_duration_seconds:60s_histogram_bucket[1h])) by (vmrange))
   ```

   This query uses the [histogram_stddev](https://docs.victoriametrics.com/victoriametrics/metricsql/#histogram_stddev) function.

1. An estimated share of requests with a duration shorter than `0.5s` over the last hour:

   ```metricsql
   histogram_share(0.5, sum(increase(request_duration_seconds:60s_histogram_bucket[1h])) by (vmrange))
   ```

   This query uses the [histogram_share](https://docs.victoriametrics.com/victoriametrics/metricsql/#histogram_share) function.

See [the list of aggregate outputs](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#aggregation-outputs), which can be specified in the `outputs` field. See also [quantiles over input metrics](#quantiles-over-input-metrics) and [aggregating by labels](#aggregating-by-labels).

## Aggregating histograms

A [histogram](https://docs.victoriametrics.com/victoriametrics/keyconcepts/#histogram) is a set of [counter](https://docs.victoriametrics.com/victoriametrics/keyconcepts/#counter) metrics with different `vmrange` or `le` labels.
Since histograms are typically used for calculating quantiles over bucket changes via the [histogram_quantile](https://docs.victoriametrics.com/victoriametrics/metricsql/#histogram_quantile) function, the appropriate aggregation output is [rate_sum](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#rate_sum):

```yaml
- match: 'http_request_duration_seconds_bucket'
  interval: 5m
  without: [instance]
  enable_windows: true
  outputs: [rate_sum]
```

This config generates the following output metrics according to [output metric naming](#output-metric-names):

```text
http_request_duration_seconds_bucket:5m_without_instance_rate_sum{le="0.1"} value1
http_request_duration_seconds_bucket:5m_without_instance_rate_sum{le="0.2"} value2
http_request_duration_seconds_bucket:5m_without_instance_rate_sum{le="0.4"} value3
http_request_duration_seconds_bucket:5m_without_instance_rate_sum{le="1"} value4
http_request_duration_seconds_bucket:5m_without_instance_rate_sum{le="3"} value5
http_request_duration_seconds_bucket:5m_without_instance_rate_sum{le="+Inf"} value6
```

The resulting metrics can be passed to the [histogram_quantile](https://docs.victoriametrics.com/victoriametrics/metricsql/#histogram_quantile) function:

```metricsql
histogram_quantile(0.9, sum(http_request_duration_seconds_bucket:5m_without_instance_rate_sum) by(le))
```

Note that histograms can be aggregated only if their `le` labels are configured identically. [VictoriaMetrics histogram buckets](https://valyala.medium.com/improving-histogram-usability-for-prometheus-and-grafana-bc7e5df0e350) have no such requirement.

It's recommended to use [aggregation windows](#aggregation-windows) when aggregating histograms if you observe [accuracy issues](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4580).

See [the list of aggregate outputs](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#aggregation-outputs), which can be specified in the `outputs` field.
See also [histograms over input metrics](#histograms-over-input-metrics) and [quantiles over input metrics](#quantiles-over-input-metrics).

# Routing

[Single-node VictoriaMetrics](https://docs.victoriametrics.com/victoriametrics/single-server-victoriametrics/) supports relabeling, deduplication and stream aggregation for all the received data, scraped or pushed. The processed data is then stored in local storage and **can't be forwarded further**.

[vmagent](https://docs.victoriametrics.com/victoriametrics/vmagent/) supports relabeling, deduplication and stream aggregation for all the received data, scraped or pushed. See the [processing order for vmagent](https://docs.victoriametrics.com/victoriametrics/vmagent/#life-of-a-sample).

Typical scenarios for data routing with `vmagent`:

1. **Aggregate incoming data and replicate to N destinations**. Specify the [`-streamAggr.config`](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/#configuration) command-line flag to aggregate the incoming data before replicating it to all the configured `-remoteWrite.url` destinations.
2. **Individually aggregate incoming data for each destination**. Specify the [`-remoteWrite.streamAggr.config`](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/#configuration) command-line flag for each `-remoteWrite.url` destination. [Relabeling](https://docs.victoriametrics.com/victoriametrics/relabeling/) via `-remoteWrite.urlRelabelConfig` can be used for routing only the selected metrics to each `-remoteWrite.url` destination.

# Deduplication

[vmagent](https://docs.victoriametrics.com/victoriametrics/vmagent/) supports online [de-duplication](https://docs.victoriametrics.com/victoriametrics/single-server-victoriametrics/#deduplication) of samples before sending them to the configured `-remoteWrite.url`.
The de-duplication can be enabled via the following options:

- By specifying the desired de-duplication interval via the `-streamAggr.dedupInterval` command-line flag for all received data or via the `-remoteWrite.streamAggr.dedupInterval` command-line flag for a particular `-remoteWrite.url` destination.
  For example, `./vmagent -remoteWrite.url=http://remote-storage/api/v1/write -remoteWrite.streamAggr.dedupInterval=30s` instructs `vmagent` to keep only the last sample per seen [time series](https://docs.victoriametrics.com/victoriametrics/keyconcepts/#time-series) every 30 seconds.
  The de-duplication is performed after applying [relabeling](https://docs.victoriametrics.com/victoriametrics/relabeling/) and before performing the aggregation.
- By specifying the `dedup_interval` option individually per each [stream aggregation config](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#stream-aggregation-config) in `-remoteWrite.streamAggr.config` or `-streamAggr.config` configs.

[Single-node VictoriaMetrics](https://docs.victoriametrics.com/victoriametrics/single-server-victoriametrics/) supports two types of de-duplication:

- After storing the duplicate samples to local storage. See the [`-dedup.minScrapeInterval`](https://docs.victoriametrics.com/victoriametrics/single-server-victoriametrics/#deduplication) command-line option.
- Before storing the duplicate samples to local storage. This type of de-duplication can be enabled via the following options:
  - By specifying the desired de-duplication interval via the `-streamAggr.dedupInterval` command-line flag.
    For example, `./victoria-metrics -streamAggr.dedupInterval=30s` instructs VictoriaMetrics to keep only the last sample per seen [time series](https://docs.victoriametrics.com/victoriametrics/keyconcepts/#time-series) every 30 seconds.
    The de-duplication is performed after applying `-relabelConfig` [relabeling](https://docs.victoriametrics.com/victoriametrics/single-server-victoriametrics/#relabeling).
  - By specifying the `dedup_interval` option individually per each [stream aggregation config](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#stream-aggregation-config) at `-streamAggr.config`.

It is possible to drop the given labels before applying the de-duplication. See [these docs](#dropping-unneeded-labels).

The online de-duplication uses the same logic as the [`-dedup.minScrapeInterval` command-line flag](https://docs.victoriametrics.com/victoriametrics/single-server-victoriametrics/#deduplication) at VictoriaMetrics.

De-duplication is applied before stream aggregation rules and can drop samples before they get matched for aggregation.

# Relabeling

It is possible to apply [arbitrary relabeling](https://docs.victoriametrics.com/victoriametrics/relabeling/) to input and output metrics during stream aggregation via the `input_relabel_configs` and `output_relabel_configs` options in the [stream aggregation config](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#stream-aggregation-config).

Relabeling rules inside `input_relabel_configs` are applied to samples matching the `match` filters before optional [deduplication](#deduplication). Relabeling rules inside `output_relabel_configs` are applied to aggregated samples before sending them to the remote storage.

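
As a minimal sketch of the input side, the config below assumes a hypothetical `pod` label that should not take part in aggregation. The metric name and the label are illustrative only; `labeldrop` is one of the standard relabeling actions:

```yaml
- match: 'http_requests_total'   # illustrative metric name
  interval: 1m
  outputs: [total]
  # Applied to matching input samples before optional deduplication
  # and before aggregation.
  input_relabel_configs:
  - action: labeldrop
    regex: "pod"                 # hypothetical label to remove
```
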
For example, the following config removes the `:1m_sum_samples` suffix added [to the output metric name](#output-metric-names):

```yaml
- interval: 1m
  outputs: [sum_samples]
  output_relabel_configs:
  - source_labels: [__name__]
    target_label: __name__
    regex: "(.+):.+"
```

Another way to remove the suffix added by stream aggregation is to add `keep_metric_names: true` to the config:

```yaml
- interval: 1m
  outputs: [sum_samples]
  keep_metric_names: true
```

See also [dropping unneeded labels](#dropping-unneeded-labels).

# Advanced usage

## Ignoring old samples

By default, all the input samples are taken into account during stream aggregation. If samples with old timestamps outside the current [aggregation interval](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#stream-aggregation-config) must be ignored, then the following options can be used:

- Pass the `-streamAggr.ignoreOldSamples` command-line flag to [single-node VictoriaMetrics](https://docs.victoriametrics.com/victoriametrics/single-server-victoriametrics/) or to [vmagent](https://docs.victoriametrics.com/victoriametrics/vmagent/). At [vmagent](https://docs.victoriametrics.com/victoriametrics/vmagent/) the `-remoteWrite.streamAggr.ignoreOldSamples` flag can be specified individually per each `-remoteWrite.url`. This enables ignoring old samples for all the [aggregation configs](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#stream-aggregation-config).
- Set the `ignore_old_samples: true` option at the particular [aggregation config](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#stream-aggregation-config). This enables ignoring old samples for that particular aggregation config.

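
A minimal sketch of the per-config variant, assuming an illustrative `http_requests_total` counter aggregated once per minute (the metric name and interval are assumptions, not taken from a real setup):

```yaml
- match: 'http_requests_total'   # illustrative metric name
  interval: 1m
  outputs: [total]
  # Skip samples whose timestamps fall outside the current aggregation interval.
  ignore_old_samples: true
```

Setting the option per config keeps the default behaviour for every other aggregator, which may be preferable when only some sources deliver delayed data.
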
## Ignore aggregation intervals on start

Streaming aggregation results may be incorrect for some time after the restart of [vmagent](https://docs.victoriametrics.com/victoriametrics/vmagent/) or [single-node VictoriaMetrics](https://docs.victoriametrics.com/victoriametrics/single-server-victoriametrics/) until all the buffered [samples](https://docs.victoriametrics.com/victoriametrics/keyconcepts/#raw-samples) are sent from remote sources to the `vmagent` or single-node VictoriaMetrics via [supported data ingestion protocols](https://docs.victoriametrics.com/victoriametrics/vmagent/#how-to-push-data-to-vmagent). In this case it may be a good idea to drop the aggregated data during the first `N` [aggregation intervals](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#stream-aggregation-config) just after the restart of `vmagent` or single-node VictoriaMetrics. This can be done via the following options:

- The `-streamAggr.ignoreFirstIntervals=N` command-line flag at `vmagent` and single-node VictoriaMetrics. This flag instructs skipping the first `N` [aggregation intervals](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#stream-aggregation-config) just after the restart across all the [configured stream aggregation configs](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/). The `-remoteWrite.streamAggr.ignoreFirstIntervals` command-line flag can be specified individually per each `-remoteWrite.url` at [vmagent](https://docs.victoriametrics.com/victoriametrics/vmagent/).
- The `ignore_first_intervals: N` option at the particular [aggregation config](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#stream-aggregation-config).

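
For instance, a config along these lines would skip the first two intervals after a restart for a single aggregator. The metric name and the value `2` are illustrative assumptions; a suitable `N` depends on how long remote sources need to drain their buffers:

```yaml
- match: 'http_requests_total'   # illustrative metric name
  interval: 1m
  outputs: [total]
  # Drop aggregated results for the first 2 intervals after start.
  ignore_first_intervals: 2
```
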

See also:

- [Flush time alignment](#flush-time-alignment)
- [Ignoring old samples](#ignoring-old-samples)

## Flush time alignment

By default, the time for aggregated data flush is aligned by the `interval` option specified in the [aggregate config](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#stream-aggregation-config). For example:

- if `interval: 1m` is set, then the aggregated data is flushed to the storage at the end of every minute
- if `interval: 1h` is set, then the aggregated data is flushed to the storage at the end of every hour

If you do not need such an alignment, then set the `no_align_flush_to_interval: true` option in the [aggregate config](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#stream-aggregation-config). In this case aggregated data flushes will be aligned to the `vmagent` start time or to the [config reload](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#configuration-update) time.

The aggregated data on the first and the last interval is dropped during `vmagent` start, restart or [config reload](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#configuration-update), since the first and the last aggregation intervals are incomplete and usually contain confusing partial data. If you need to preserve the aggregated data on these intervals, then set the `flush_on_shutdown: true` option in the [aggregate config](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#stream-aggregation-config).

See also:

- [Ignore aggregation intervals on start](#ignore-aggregation-intervals-on-start)
- [Ignoring old samples](#ignoring-old-samples)

## Output metric names

Output metric names for stream aggregation are constructed according to the following pattern:

```text
<metric_name>:<interval>[_by_<by_labels>][_without_<without_labels>]_<output>
```

- `<metric_name>` is the original metric name.
- `<interval>` is the interval specified in the [stream aggregation config](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#stream-aggregation-config).
- `<by_labels>` is a `_`-delimited sorted list of `by` labels specified in the [stream aggregation config](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#stream-aggregation-config). If the `by` list is missing in the config, then the `_by_<by_labels>` part isn't included in the output metric name.
- `<without_labels>` is an optional `_`-delimited sorted list of `without` labels specified in the [stream aggregation config](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#stream-aggregation-config). If the `without` list is missing in the config, then the `_without_<without_labels>` part isn't included in the output metric name.
- `<output>` is the aggregate used for constructing the output metric. The aggregate name is taken from the `outputs` list at the corresponding [stream aggregation config](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#stream-aggregation-config).

Both input and output metric names can be modified if needed via relabeling according to [these docs](#relabeling).

It is possible to leave the original metric name after the aggregation by specifying the `keep_metric_names: true` option at the [stream aggregation config](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#stream-aggregation-config). The `keep_metric_names` option can be used if only a single output is set in the [`outputs` list](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#aggregation-outputs).

## Aggregating by labels

All the labels for the input metrics are preserved by default in the output metrics.
For example, the input metric `foo{app="bar",instance="host1"}` results in the output metric `foo:1m_sum_samples{app="bar",instance="host1"}` when the following [stream aggregation config](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#stream-aggregation-config) is used:

```yaml
- interval: 1m
  outputs: [sum_samples]
```

The input labels can be removed via the `without` list specified in the config. For example, the following config removes the `instance` label from output metrics by summing input samples across all the instances:

```yaml
- interval: 1m
  without: [instance]
  outputs: [sum_samples]
```

In this case the `foo{app="bar",instance="..."}` input metrics are transformed into the `foo:1m_without_instance_sum_samples{app="bar"}` output metric according to [output metric naming](#output-metric-names).

It is possible to specify the exact list of labels in the output metrics via the `by` list. For example, the following config sums input samples by the `app` label:

```yaml
- interval: 1m
  by: [app]
  outputs: [sum_samples]
```

In this case the `foo{app="bar",instance="..."}` input metrics are transformed into the `foo:1m_by_app_sum_samples{app="bar"}` output metric according to [output metric naming](#output-metric-names).

The labels used in `by` and `without` lists can be modified via the `input_relabel_configs` section. See [these docs](#relabeling).

See also [aggregation outputs](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#aggregation-outputs).

## Dropping unneeded labels

To optimize performance and reduce [churn rate](https://docs.victoriametrics.com/guides/understand-your-setup-size/#churn-rate), it's important to drop unnecessary labels from incoming samples. There are various strategies for label dropping, which can be implemented individually or combined.

**Global Label Dropping** is configured using the `-streamAggr.dropInputLabels` flag.
It works in conjunction with the `-streamAggr.config` flag and applies to all matching sections within it. The labels are dropped before [input relabeling](#relabeling), [deduplication](#deduplication), and [stream aggregation](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#aggregation-outputs) are applied. This flag can be used with [vmagent](https://docs.victoriametrics.com/victoriametrics/vmagent/), vminsert, and [vmsingle](https://docs.victoriametrics.com/victoriametrics/single-server-victoriametrics/).

The following example demonstrates how to drop the `replica` and `az` labels for both `foo` and `bar` remote write targets:

```bash
/path/to/vmagent \
  -remoteWrite.url="http://foo/api/v1/write" \
  -remoteWrite.url="http://bar/api/v1/write" \
  -streamAggr.config="aggr.yaml" \
  -streamAggr.dropInputLabels="replica,az"
```

**Per Remote Write Label Drop** is configured using the `-remoteWrite.streamAggr.dropInputLabels` flag. It should be defined as many times as there are `-remoteWrite.url` flags. To drop multiple labels for a remote write, use `^^` to separate them. The labels are dropped before [input relabeling](#relabeling), [de-duplication](#deduplication), and [stream aggregation](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#aggregation-outputs) are applied. This flag is available for [vmagent](https://docs.victoriametrics.com/victoriametrics/vmagent/) only.

In the example below, `replica` and `az` are dropped for the `foo` target, while `instance` is dropped for the `bar` target:

```bash
/path/to/vmagent \
  -remoteWrite.url="http://foo/api/v1/write" \
  -remoteWrite.url="http://bar/api/v1/write" \
  -remoteWrite.streamAggr.config="aggr.yaml" \
  -remoteWrite.streamAggr.dropInputLabels="replica^^az" \
  -remoteWrite.streamAggr.dropInputLabels="instance"
```

**Config-based label drop** can be defined within the [stream aggregation config](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#stream-aggregation-config)
using the `drop_input_labels` key.
This method applies to configurations provided via either the `-streamAggr.config` or `-remoteWrite.streamAggr.config` flag.
When specified, `drop_input_labels` takes precedence over any label drop definitions set via flags.
Below is an example of an `aggr.yaml` configuration that drops the `replica` and `az` labels from `process_resident_memory_bytes` metrics:

```yaml
- match: 'process_resident_memory_bytes'
  interval: '1m'
  drop_input_labels: ['replica', 'az']
  outputs: ['avg']
  keep_metric_names: true
```

# Troubleshooting

- [Unexpected spikes for `total` or `increase` outputs](#staleness).
- [Lower than expected values for `total_prometheus` and `increase_prometheus` outputs](#staleness).
- [High memory usage and CPU usage](#high-resource-usage).
- [Unexpected results in vmagent cluster mode](#cluster-mode).
- [Inaccurate aggregation results for histograms](#aggregation-windows).

## Aggregation windows

By default, stream aggregation and deduplication store a single state per aggregation output.
The data for each aggregator is flushed independently once per aggregation interval.
But there's no guarantee that incoming samples with timestamps close to the aggregation interval's end will get into it.
For example, when aggregating with `interval: 1m` a data sample with timestamp 1739473078 (18:57:58) can fall into aggregation round `18:58:00` or `18:59:00`.
It depends on network lag, load, clock synchronization, etc. In most scenarios it doesn't impact aggregation or deduplication results,
which are consistent within margin of error. But for metrics represented as a collection of series,
like [histograms](https://docs.victoriametrics.com/victoriametrics/keyconcepts/#histogram), such inaccuracy leads to invalid aggregation results.

For this case, streaming aggregation and deduplication support a mode with aggregation windows for the current and previous states.
With this mode, flush doesn't happen immediately but is shifted by a calculated samples lag
that improves correctness for delayed data. {{% available_from "v1.112.0" %}}

Enabling this mode increases resource usage: memory usage is expected to double, since the aggregation stores two states
instead of one. However, this significantly improves accuracy of calculations. Aggregation windows can be enabled via
the following settings:

- `-streamAggr.enableWindows` at [single-node VictoriaMetrics](https://docs.victoriametrics.com/victoriametrics/single-server-victoriametrics/)
  and [vmagent](https://docs.victoriametrics.com/victoriametrics/vmagent/). At [vmagent](https://docs.victoriametrics.com/victoriametrics/vmagent/)
  the `-remoteWrite.streamAggr.enableWindows` flag can be specified individually per each `-remoteWrite.url`.
  If one of these flags is set, then all aggregators will be using fixed windows. In conjunction with `-remoteWrite.streamAggr.dedupInterval` or
  `-streamAggr.dedupInterval` fixed aggregation windows are enabled on deduplicator as well.
- `enable_windows` option in [aggregation config](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/#stream-aggregation-config).
  It allows enabling aggregation windows for a specific aggregator.
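
As a minimal sketch of the per-aggregator setting described above, the following `aggr.yaml` fragment enables aggregation windows for a single histogram aggregator only. The metric name, the `without` list and the `total` output are illustrative assumptions rather than a recommended setup; only the `enable_windows` option itself comes from the text above.

```yaml
# Illustrative fragment: metric name, labels and output are assumptions.
- match: 'http_request_duration_seconds_bucket'
  interval: 1m
  without: [instance]
  outputs: [total]
  # Keep both the current and the previous window state for this aggregator,
  # so samples arriving close to the interval boundary land in the right window.
  # Expect roughly doubled memory usage for this aggregator.
  enable_windows: true
```

Enabling the option per aggregator, instead of via the global flag, limits the extra memory cost to the aggregators that process histogram-like metrics.
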
## Staleness

The following outputs track the last seen per-series values in order to properly calculate output values:

- [histogram_bucket](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#histogram_bucket)
- [increase](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#increase)
- [increase_prometheus](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#increase_prometheus)
- [rate_avg](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#rate_avg)
- [rate_sum](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#rate_sum)
- [total](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#total)
- [total_prometheus](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#total_prometheus)

The last seen per-series value is dropped if no new samples are received for the given time series during two consecutive aggregation
intervals specified in [stream aggregation config](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#stream-aggregation-config) via `interval` option.
If a new sample for the existing time series is received after that, then it is treated as the first sample for a new time series.
This may lead to the following issues:

- Lower than expected results for [total_prometheus](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#total_prometheus) and [increase_prometheus](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#increase_prometheus) outputs,
  since they ignore the first sample in a new time series.
- Unexpected spikes for [total](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#total) and [increase](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#increase) outputs,
  since they assume that new time series start from 0.

These issues can be fixed in the following ways:

- By increasing the `interval` option at [stream aggregation config](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#stream-aggregation-config), so it covers the expected
  delays in data ingestion pipelines.
- By specifying the `staleness_interval` option at [stream aggregation config](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#stream-aggregation-config), so it covers the expected
  delays in data ingestion pipelines. By default, the `staleness_interval` equals `2 x interval`.

## High resource usage

The following solutions can help reduce memory and CPU usage during streaming aggregation:

- Use more specific `match` filters at [streaming aggregation config](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#stream-aggregation-config),
  so only the really needed [raw samples](https://docs.victoriametrics.com/victoriametrics/keyconcepts/#raw-samples) are aggregated.
- Increase the aggregation interval by specifying a bigger duration for the `interval` option at [streaming aggregation config](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#stream-aggregation-config).
- Generate fewer output time series by using a less specific [`by` list](#aggregating-by-labels) or a more specific [`without` list](#aggregating-by-labels).
- Drop unneeded long labels in input samples via [input_relabel_configs](#relabeling).
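
The suggestions above can be combined in a single aggregation rule. The fragment below is a sketch under stated assumptions: the metric name, the `job` filter, the `path` label and the `trace_id` label are hypothetical, and `labeldrop` is the standard Prometheus-compatible relabeling action.

```yaml
# Hypothetical metric and label names, for illustration only.
- match: 'http_requests_total{job="api"}'  # narrow filter: only the needed series are aggregated
  interval: 5m                             # longer interval: fewer flushes and output samples
  by: [path]                               # short `by` list: fewer output time series
  outputs: [total]
  input_relabel_configs:
  # Drop a long, high-cardinality label before the samples are aggregated.
  - action: labeldrop
    regex: "trace_id"
```

Each setting can be adopted independently, so it is usually worth applying them one at a time and checking memory and CPU usage after each change.
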
## Cluster mode

If you use [vmagent in cluster mode](https://docs.victoriametrics.com/victoriametrics/vmagent/#scraping-big-number-of-targets) for streaming aggregation,
then be careful when using [`by` or `without` options](#aggregating-by-labels) or when modifying sample labels
via [relabeling](#relabeling), since incorrect usage may result in duplicates and data collision.

For example, if more than one `vmagent` instance calculates [increase](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#increase) for `http_requests_total` metric
with `by: [path]` option, then all the `vmagent` instances will aggregate samples to the same set of time series with different `path` labels.
The proper fix would be [adding a unique label](https://docs.victoriametrics.com/victoriametrics/vmagent/#adding-labels-to-metrics) for all the output samples
produced by each `vmagent`, so they are aggregated into distinct sets of [time series](https://docs.victoriametrics.com/victoriametrics/keyconcepts/#time-series).
These time series then can be aggregated later as needed during querying.

If `vmagent` instances run in Docker or Kubernetes, then you can refer to `POD_NAME` or `HOSTNAME` environment variables
as a unique label value per each `vmagent` via `-remoteWrite.label=vmagent=%{HOSTNAME}` command-line flag.
See [these docs](https://docs.victoriametrics.com/victoriametrics/single-server-victoriametrics/#environment-variables) on how to refer to environment variables in VictoriaMetrics components.

## Common mistakes

### Put aggregator behind load balancer

When configuring the aggregation rule, make sure that `vmagent` receives all the required data to satisfy the `match` rule.
If traffic to the vmagent goes through the load balancer, it could happen that vmagent will be receiving only a fraction of the data
and produce incomplete aggregations.

To keep aggregation results consistent, make sure that vmagent receives all the required data for aggregation.
If you need to split the load across multiple vmagents, try sharding the traffic among them via metric names or labels.
For example, see how vmagent could consistently [shard data across remote write destinations](https://docs.victoriametrics.com/victoriametrics/vmagent/#sharding-among-remote-storages)
via `-remoteWrite.shardByURL.labels` or `-remoteWrite.shardByURL.ignoreLabels` cmd-line flags.

### Create aggregator per each recording rule

Stream aggregation can be used as an alternative to [recording rules](#recording-rules-alternative).
But creating an aggregation rule per each recording rule can lead to elevated resource usage on the vmagent,
because the ingestion stream should be matched against every configured aggregation rule.

To optimize this, we recommend merging together aggregations which only differ in match expressions.
For example, consider the following list of recording rules:

```yaml
- expr: sum(rate(node_cpu_seconds_total{mode!="idle",mode!="iowait",mode!="steal"}[3m])) BY (instance)
  record: instance:node_cpu:rate:sum
- expr: sum(rate(node_network_receive_bytes_total[3m])) BY (instance)
  record: instance:node_network_receive_bytes:rate:sum
- expr: sum(rate(node_network_transmit_bytes_total[3m])) BY (instance)
  record: instance:node_network_transmit_bytes:rate:sum
```

These rules can be effectively converted into a single aggregation rule:

```yaml
- match:
  - node_cpu_seconds_total{mode!="idle",mode!="iowait",mode!="steal"}
  - node_network_receive_bytes_total
  - node_network_transmit_bytes_total
  interval: 3m
  outputs: [rate_sum]
  by:
  - instance
  output_relabel_configs:
  - source_labels: [__name__]
    target_label: __name__
    regex: "(.+):.+"
    replacement: "instance:$1:rate:sum"
```

**Note**: having a separate aggregator for a certain `match` expression can only be justified when a single aggregator cannot keep up with all
the data pushed to it within an aggregation interval.
### Use identical --remoteWrite.streamAggr.config for all remote writes

Each specified `-remoteWrite.streamAggr.config` aggregation config is processed independently on its own copy of the data stream.
So if you want to aggregate incoming data and replicate it across multiple destinations, it would be more efficient to use
a global `-streamAggr.config` instead. In this way, vmagent will perform aggregation only once and then will replicate
the result across multiple `-remoteWrite.url` destinations.

### Use aggregated metrics like original ones

Stream aggregation allows keeping original metric names after aggregation by using `keep_metric_names` setting.
But the "meaning" of aggregated metrics is usually different from the original ones after the aggregation.
Make sure to update queries in your alerting rules and dashboards accordingly if you use the `keep_metric_names` setting.

### Use different deduplication intervals on storage and vmagent

If the storage uses `-dedup.minScrapeInterval` but `vmagent` has no deduplication configured, aggregation results may not match queries on the storage.
For example, `sum(rate(foo[1m])) by (instance)` query result can differ from the [rate_sum](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#rate_sum)
aggregation result `foo:1m_by_instance_rate_sum`.

This happens because vmagent aggregates all samples, while queries on the storage use deduplicated samples.
To avoid this, set `-streamAggr.dedupInterval` or `-remoteWrite.streamAggr.dedupInterval` on `vmagent` to match the storage interval.

---

Section below contains backward-compatible anchors for links that were moved or renamed.

###### Configuration

Moved to [stream-aggregation/configuration](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/).

###### Stream aggregation config

Moved to [stream-aggregation/configuration/#stream-aggregation-config](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#stream-aggregation-config).

###### Configuration update

Moved to [stream-aggregation/configuration/#configuration-update](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#configuration-update).

###### Aggregation outputs

Moved to [stream-aggregation/configuration/#aggregation-outputs](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#aggregation-outputs).

###### avg

Moved to [stream-aggregation/configuration/#avg](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#avg).

###### count_samples

Moved to [stream-aggregation/configuration/#count_samples](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#count_samples).

###### count_series

Moved to [stream-aggregation/configuration/#count_series](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#count_series).

###### histogram_bucket

Moved to [stream-aggregation/configuration/#histogram_bucket](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#histogram_bucket).

###### increase

Moved to [stream-aggregation/configuration/#increase](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#increase).

###### increase_prometheus

Moved to [stream-aggregation/configuration/#increase_prometheus](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#increase_prometheus).

###### last

Moved to [stream-aggregation/configuration/#last](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#last).

###### max

Moved to [stream-aggregation/configuration/#max](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#max).

###### min

Moved to [stream-aggregation/configuration/#min](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#min).

###### rate_avg

Moved to [stream-aggregation/configuration/#rate_avg](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#rate_avg).

###### rate_sum

Moved to [stream-aggregation/configuration/#rate_sum](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#rate_sum).

###### stddev

Moved to [stream-aggregation/configuration/#stddev](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#stddev).

###### stdvar

Moved to [stream-aggregation/configuration/#stdvar](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#stdvar).

###### sum_samples

Moved to [stream-aggregation/configuration/#sum_samples](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#sum_samples).

###### total

Moved to [stream-aggregation/configuration/#total](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#total).

###### total_prometheus

Moved to [stream-aggregation/configuration/#total_prometheus](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#total_prometheus).

###### unique_samples

Moved to [stream-aggregation/configuration/#unique_samples](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#unique_samples).

###### quantiles

Moved to [stream-aggregation/configuration/#quantiles](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/#quantiles).