Status | |
---|---|
Stability | development |
Supported pipeline types | traces |
Distributions | contrib |
Note: Currently experimental and subject to breaking changes (e.g. change from processor to exporter/translator component). See: #403.
Aggregates Request, Error and Duration (R.E.D) metrics from span data.
Request counts are computed as the number of spans seen per unique set of dimensions, including Errors. For example, the following metric shows 142 calls:
calls_total{http_method="GET",http_status_code="200",operation="/Address",service_name="shippingservice",span_kind="SPAN_KIND_SERVER",status_code="STATUS_CODE_UNSET"} 142
Multiple metrics can be aggregated if, for instance, a user wishes to view call counts just on `service_name` and `operation`.
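The roll-up described above can be sketched in a few lines of Python. This is only an illustration, not the processor's implementation: the `aggregate_calls` helper and the in-memory `SERIES` sample are hypothetical, and the series with count 8 is invented so that there is something to sum.

```python
from collections import Counter

# Hypothetical sample of calls_total series: (labels, count) pairs.
SERIES = [
    ({"service_name": "shippingservice", "operation": "/Address",
      "http_method": "GET", "http_status_code": "200"}, 142),
    ({"service_name": "shippingservice", "operation": "/Address",
      "http_method": "GET", "http_status_code": "503"}, 8),
    ({"service_name": "frontend", "operation": "/checkout",
      "http_method": "GET", "http_status_code": "503"}, 220),
]


def aggregate_calls(series, keep):
    """Sum call counts over every dimension that is not listed in `keep`."""
    totals = Counter()
    for labels, count in series:
        key = tuple(labels[name] for name in keep)
        totals[key] += count
    return dict(totals)


by_service_and_operation = aggregate_calls(SERIES, ("service_name", "operation"))
print(by_service_and_operation)
```

Series that differ only in dropped dimensions (here `http_method` and `http_status_code`) collapse into a single total per `service_name` and `operation` pair.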
Error counts are computed from the Request counts which have an "Error" Status Code metric dimension. For example, the following metric indicates 220 errors:
calls_total{http_method="GET",http_status_code="503",operation="/checkout",service_name="frontend",span_kind="SPAN_KIND_CLIENT",status_code="STATUS_CODE_ERROR"} 220
Duration is computed from the difference between the span start and end times and inserted into the relevant latency histogram time bucket for each unique set of dimensions. For example, the following latency buckets indicate that the vast majority of spans (about 9K) fall into the 100ms bucket, meaning their latency is above 10ms and at most 100ms:
latency_bucket{http_method="GET",http_status_code="200",label1="value1",operation="/Address",service_name="shippingservice",span_kind="SPAN_KIND_SERVER",status_code="STATUS_CODE_UNSET",le="2"} 327
latency_bucket{http_method="GET",http_status_code="200",label1="value1",operation="/Address",service_name="shippingservice",span_kind="SPAN_KIND_SERVER",status_code="STATUS_CODE_UNSET",le="6"} 751
latency_bucket{http_method="GET",http_status_code="200",label1="value1",operation="/Address",service_name="shippingservice",span_kind="SPAN_KIND_SERVER",status_code="STATUS_CODE_UNSET",le="10"} 1195
latency_bucket{http_method="GET",http_status_code="200",label1="value1",operation="/Address",service_name="shippingservice",span_kind="SPAN_KIND_SERVER",status_code="STATUS_CODE_UNSET",le="100"} 10180
latency_bucket{http_method="GET",http_status_code="200",label1="value1",operation="/Address",service_name="shippingservice",span_kind="SPAN_KIND_SERVER",status_code="STATUS_CODE_UNSET",le="250"} 10180
...
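Because the `le` buckets are cumulative, the number of spans in each individual bucket is the difference between neighbouring counts. The sketch below works that out for the sample above and also shows how a single duration could be assigned to a bucket. It assumes the `le` values are milliseconds and that upper bounds are inclusive; `per_bucket_counts` and `bucket_for` are hypothetical helper names, not part of the processor.

```python
from bisect import bisect_left

# Cumulative counts copied from the latency_bucket sample above: (le in ms, count).
CUMULATIVE = [(2, 327), (6, 751), (10, 1195), (100, 10180), (250, 10180)]


def per_bucket_counts(cumulative):
    """Convert cumulative `le` counts into the number of spans in each bucket."""
    counts = []
    previous = 0
    for upper_bound, total in cumulative:
        counts.append((upper_bound, total - previous))
        previous = total
    return counts


def bucket_for(duration_ms, bounds):
    """Return the smallest upper bound >= duration_ms, or None if it exceeds all bounds."""
    index = bisect_left(bounds, duration_ms)
    return bounds[index] if index < len(bounds) else None


print(per_bucket_counts(CUMULATIVE))
print(bucket_for(42.0, [2, 6, 10, 100, 250]))
```

The fourth bucket holds 10180 - 1195 = 8985 spans, which is the "9K" figure quoted in the text.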
Each metric will have at least the following dimensions because they are common across all spans:
- Service name
- Operation
- Span kind
- Status code
This processor lets traces continue through the pipeline unmodified.
The following settings are required:
- `metrics_exporter`: the name of the exporter that this processor will write metrics to. This exporter must be present in a pipeline.
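The requirement that the named exporter be present in a pipeline can be illustrated with a small check. This is a sketch under assumptions: the configuration is represented as a plain Python dict, `check_metrics_exporter` is a hypothetical helper, and the collector's own validation may differ.

```python
def check_metrics_exporter(config):
    """Return True if spanmetrics' metrics_exporter is used by some pipeline."""
    target = config["processors"]["spanmetrics"]["metrics_exporter"]
    pipelines = config["service"]["pipelines"].values()
    return any(target in pipeline.get("exporters", []) for pipeline in pipelines)


# Hypothetical, trimmed-down configuration expressed as a Python dict.
config = {
    "processors": {"spanmetrics": {"metrics_exporter": "otlp/spanmetrics"}},
    "service": {
        "pipelines": {
            "traces": {"processors": ["spanmetrics"], "exporters": ["jaeger"]},
            "metrics/spanmetrics": {"exporters": ["otlp/spanmetrics"]},
        }
    },
}
print(check_metrics_exporter(config))
```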
The following settings can be optionally configured:
- `latency_histogram_buckets`: the list of durations defining the latency histogram buckets.
  - Default: `[2ms, 4ms, 6ms, 8ms, 10ms, 50ms, 100ms, 200ms, 400ms, 800ms, 1s, 1400ms, 2s, 5s, 10s, 15s]`
- `dimensions`: the list of dimensions to add together with the default dimensions defined above. Each additional dimension is defined with a `name`, which is looked up in the span's collection of attributes or resource attributes (AKA process tags) such as `ip`, `host.name` or `region`. If the `name`d attribute is missing in the span, the optionally provided `default` is used. If no `default` is provided, this dimension will be omitted from the metric.
- `dimensions_cache_size`: the maximum number of items in `metric_key_to_dimensions_cache`. If not provided, the default size of `1000` is used.
- `aggregation_temporality`: defines the aggregation temporality of the generated metrics. One of either `AGGREGATION_TEMPORALITY_CUMULATIVE` or `AGGREGATION_TEMPORALITY_DELTA`.
  - Default: `AGGREGATION_TEMPORALITY_CUMULATIVE`
- `namespace`: defines the namespace of the generated metrics. If `namespace` is provided, the generated metric names are prefixed with `namespace.`.
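The lookup rules for the `dimensions` setting can be sketched as follows. This is an illustration under assumptions rather than the processor's code: `resolve_dimensions` is a hypothetical helper, and the order in which span attributes and resource attributes are consulted is assumed, since the text above does not specify it.

```python
def resolve_dimensions(configured, span_attributes, resource_attributes):
    """Build the extra metric dimensions for one span.

    `configured` is a list of dicts shaped like the `dimensions` setting:
    {"name": ..., "default": ...}, where "default" is optional.
    """
    resolved = {}
    for dimension in configured:
        name = dimension["name"]
        if name in span_attributes:
            resolved[name] = span_attributes[name]
        elif name in resource_attributes:
            # Assumption: resource attributes are consulted after span attributes.
            resolved[name] = resource_attributes[name]
        elif "default" in dimension:
            resolved[name] = dimension["default"]
        # No value and no default: the dimension is omitted from the metric.
    return resolved


configured = [
    {"name": "http.method", "default": "GET"},
    {"name": "http.status_code"},
    {"name": "region"},
    {"name": "host.name"},
]
span_attributes = {"http.status_code": "200"}
resource_attributes = {"region": "us-east-1"}
resolved = resolve_dimensions(configured, span_attributes, resource_attributes)
print(resolved)
```

In this run `http.method` falls back to its default, `http.status_code` and `region` are taken from the attributes, and `host.name` is dropped because it has neither a value nor a default.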
The following is a simple example usage of the spanmetrics processor.
For configuration examples on other use cases, please refer to More Examples.
The full list of settings exposed for this processor is documented here.
receivers:
  jaeger:
    protocols:
      thrift_http:
        endpoint: "0.0.0.0:14278"
  # Dummy receiver that's never used, because a pipeline is required to have one.
  otlp/spanmetrics:
    protocols:
      grpc:
        endpoint: "localhost:12345"
  otlp:
    protocols:
      grpc:
        endpoint: "localhost:55677"
processors:
  batch:
  spanmetrics:
    metrics_exporter: otlp/spanmetrics
    latency_histogram_buckets: [100us, 1ms, 2ms, 6ms, 10ms, 100ms, 250ms]
    dimensions:
      - name: http.method
        default: GET
      - name: http.status_code
    dimensions_cache_size: 1000
    aggregation_temporality: "AGGREGATION_TEMPORALITY_CUMULATIVE"
exporters:
  jaeger:
    endpoint: localhost:14250
  otlp/spanmetrics:
    endpoint: "localhost:55677"
    tls:
      insecure: true
  prometheus:
    endpoint: "0.0.0.0:8889"
service:
  pipelines:
    traces:
      receivers: [jaeger]
      processors: [spanmetrics, batch]
      exporters: [jaeger]
    # The exporter name must match the metrics_exporter name.
    # The receiver is just a dummy and never used; added to pass validation requiring at least one receiver in a pipeline.
    metrics/spanmetrics:
      receivers: [otlp/spanmetrics]
      exporters: [otlp/spanmetrics]
    metrics:
      receivers: [otlp]
      exporters: [prometheus]
For more example configurations covering other use cases, see the testdata directory.