Commit cab21aa

edit
1 parent 0ca0d7e commit cab21aa

1 file changed: +19 -11 lines changed

modules/distr-tracing-tempo-config-spanmetrics.adoc

Lines changed: 19 additions & 11 deletions
@@ -93,30 +93,38 @@ spec:
 <1> Enables the monitoring tab in the Jaeger console.
 <2> The service name for Thanos Querier from user-workload monitoring.
 
-== Enable alerting on span RED metrics
+== Span RED metrics and alerting rules
 
-The metrics generated by the `spanmetrics` connector can be used in alerting rules. For instance to alert on a slow service or define service level objectives (SLOs).
-The connector creates `duration_bucket` histogram and `calls` counter metric. These metrics have labels that identify service, API name, operation type and other attributes.
+The metrics generated by the `spanmetrics` connector are usable with alerting rules. For example, for alerts about a slow service or to define service level objectives (SLOs), the connector creates a `duration_bucket` histogram and the `calls` counter metric. These metrics have labels that identify the service, API name, operation type, and other attributes.
 
-.Labels present on the metrics created oin the `spanmetrics` connector.
+.Labels of the metrics created in the `spanmetrics` connector
 [options="header"]
 [cols="l, a, a"]
 |===
 |Label |Description |Values
+
 |service_name
-| Service name set by `otel_service_name` environment variable.
+|Service name set by the `otel_service_name` environment variable.
 |`frontend`
 
 |span_name
 | Name of the operation.
-|`/`, `/customer`
+|
+* `/`
+* `/customer`
 
 |span_kind
-| Span kind identifies the server, client, messaging or internal operation.
-|`SPAN_KIND_SERVER`, `SPAN_KIND_CLIENT`, `SPAN_KIND_PRODUCER`, `SPAN_KIND_CONSUMER`, `SPAN_KIND_INTERNAL`
+|Identifies the server, client, messaging, or internal operation.
+|
+* `SPAN_KIND_SERVER`
+* `SPAN_KIND_CLIENT`
+* `SPAN_KIND_PRODUCER`
+* `SPAN_KIND_CONSUMER`
+* `SPAN_KIND_INTERNAL`
+
 |===
 
-.PrometheusRule custom resource to define an alert for SLO to serve 95% of requests within 2000ms on the frontend service.
+.Example PrometheusRule CR that defines an alerting rule for SLO when not serving 95% of requests within 2000ms on the front-end service
 [source,yaml]
 ----
 apiVersion: monitoring.coreos.com/v1
@@ -128,11 +136,11 @@ spec:
 - name: server-side-latency
 rules:
 - alert: SpanREDFrontendAPIRequestLatency
-expr: histogram_quantile(0.95, sum(rate(duration_bucket{service_name="frontend", span_kind="SPAN_KIND_SERVER"}[5m])) by (le, service_name, span_name)) > 2000 <1>
+expr: histogram_quantile(0.95, sum(rate(duration_bucket{service_name="frontend", span_kind="SPAN_KIND_SERVER"}[5m])) by (le, service_name, span_name)) > 2000 # <1>
 labels:
 severity: Warning
 annotations:
 summary: "High request latency on {{$labels.service_name}} and {{$labels.span_name}}"
 description: "{{$labels.instance}} has 95th request latency above 2s (current value: {{$value}}s)"
 ----
-<1> The expression to check if 95% of frontend server response time is below 2000 ms. The time range (`[5m]`) should be at least four times the scrape interval and long enough to accommodate change in the metric.
+<1> The expression for checking if 95% of the front-end server response time values are below 2000 ms. The time range (`[5m]`) must be at least four times the scrape interval and long enough to accommodate a change in the metric.
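
Reviewer note, not part of the commit: the alert expression relies on `histogram_quantile(0.95, ...)` over the `duration_bucket` series. The Python sketch below illustrates, under simplifying assumptions, how a 95th percentile could be estimated from cumulative histogram buckets by linear interpolation and then compared with the 2000 ms threshold. The function name `estimate_quantile` and all bucket bounds and counts are hypothetical; the real PromQL function handles more edge cases than this approximation does.

```python
import math


def estimate_quantile(q, buckets):
    """Estimate the q-quantile from cumulative (upper_bound_ms, count) buckets.

    Simplified illustration of linear interpolation inside the bucket that
    contains the target rank. Not the actual Prometheus implementation.
    """
    if not 0 <= q <= 1:
        raise ValueError("q must be between 0 and 1")
    buckets = sorted(buckets)
    if not buckets or buckets[-1][0] != math.inf:
        raise ValueError("the highest bucket must have an infinite upper bound")

    total = buckets[-1][1]
    if total == 0:
        return math.nan  # no observations in the window

    rank = q * total
    prev_bound, prev_count = 0.0, 0.0  # assume the lowest bucket starts at 0
    for bound, count in buckets:
        if count >= rank:
            if bound == math.inf:
                # The rank falls in the open-ended bucket: report the
                # highest finite upper bound instead of infinity.
                return prev_bound
            if count == prev_count:
                return bound  # empty bucket, nothing to interpolate
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count


# Hypothetical cumulative request counts per latency bucket (milliseconds).
slow_frontend = [(500, 60), (1000, 85), (2000, 92), (5000, 99), (math.inf, 100)]

p95 = estimate_quantile(0.95, slow_frontend)
alert_firing = p95 > 2000  # same comparison as the rule's "> 2000"
print(f"p95 = {p95:.1f} ms, alert firing: {alert_firing}")
```

With these made-up counts, the 95th request falls in the 2000 ms to 5000 ms bucket, so the estimate lands above the threshold and the comparison is true. Because the estimate is interpolated within a bucket, its precision depends on how the bucket boundaries are configured.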
