| 1 | +# Tracing in {{ ydb-short-name }} |
| 2 | + |
| 3 | +{% note info %} |
| 4 | + |
| 5 | +The [OpenTelemetry](https://opentelemetry.io/) website describes the concept of tracing in detail in the [Observability Primer](https://opentelemetry.io/docs/concepts/observability-primer/) article. |
| 6 | + |
| 7 | +{% endnote %} |
| 8 | + |
| 9 | +Tracing lets you follow the detailed path of a request through a distributed system. The path of a single request (a trace) is described by a set of spans. A span is a time interval, usually corresponding to the execution of a specific operation (e.g., writing data to disk or executing a transaction). Spans form a tree in which the subtree of a span usually breaks that span down in more detail, although this is not always the case. |
| 10 | + |
| 11 | + |
| 12 | + |
| 13 | +To aggregate disparate spans into traces, they are sent to a *collector*. This service aggregates and stores received spans for subsequent trace analysis. {{ ydb-short-name }} does not include this service; the administrator must set it up independently. Typically, [Jaeger](https://www.jaegertracing.io/) is used as a collector. |
| 14 | + |
| 15 | +## Minimal configuration |
| 16 | + |
| 17 | +To enable tracing in {{ ydb-short-name }}, add the following section to the [configuration](../../../deploy/configuration/config.md): |
| 18 | + |
| 19 | +```yaml |
| 20 | +tracing_config: |
| 21 | + backend: |
| 22 | + opentelemetry: |
| 23 | + collector_url: grpc://example.com:4317 |
| 24 | + service_name: ydb |
| 25 | + external_throttling: |
| 26 | + - max_traces_per_minute: 10 |
| 27 | +``` |
| 28 | + |
| 29 | +Here, the `collector_url` field sets the URL of an [OTLP-compatible](https://opentelemetry.io/docs/specs/otlp/) span collector. More details on the backend section can be found in the [relevant section](./setup.md#backend). |
| 30 | + |
| 31 | +With this configuration, no requests are sampled, and no more than ten requests per minute with an [external trace-id](./external-traces.md) are traced by each cluster node. |
| 32 | + |
| 33 | +## Section descriptions |
| 34 | + |
| 35 | +### Backend {#backend} |
| 36 | + |
| 37 | +#### Example section |
| 38 | + |
| 39 | +```yaml |
| 40 | +tracing_config: |
| 41 | + # ... |
| 42 | + backend: |
| 43 | + opentelemetry: |
| 44 | + collector_url: grpc://example.com:4317 |
| 45 | + service_name: ydb |
| 46 | +``` |
| 47 | + |
| 48 | +#### Description |
| 49 | + |
| 50 | +This section describes the span collector. Currently, the only option is `opentelemetry`. Spans are pushed from the cluster node to the collector, requiring the collector to be [OTLP](https://opentelemetry.io/docs/specs/otlp/) compatible. |
| 51 | + |
| 52 | +In the `opentelemetry` section: |
| 53 | +* `collector_url` – the URL of the span collector. The scheme can be either `grpc://` for an insecure connection or `grpcs://` for a TLS connection. |
| 54 | +* `service_name` – the service name with which all spans are labeled. |
| 55 | + |
| 56 | +Both parameters are mandatory. |
| 57 | + |
| 58 | +### Uploader {#uploader} |
| 59 | + |
| 60 | +#### Example section |
| 61 | + |
| 62 | +```yaml |
| 63 | +tracing_config: |
| 64 | + # ... |
| 65 | + uploader: |
| 66 | + max_exported_spans_per_second: 30 |
| 67 | + max_spans_in_batch: 100 |
| 68 | + max_bytes_in_batch: 10485760 # 10 MiB |
| 69 | + max_export_requests_inflight: 3 |
| 70 | + max_batch_accumulation_milliseconds: 5000 |
| 71 | + span_export_timeout_seconds: 120 |
| 72 | +``` |
| 73 | + |
| 74 | +#### Description |
| 75 | + |
| 76 | +The uploader is a cluster node component responsible for sending spans to the collector. To avoid overloading the span collector, the uploader will not send more than `max_exported_spans_per_second` spans per second on average. |
| 77 | + |
| 78 | +For efficiency, the uploader sends spans in batches. Each batch contains no more than `max_spans_in_batch` spans with a total serialized size of no more than `max_bytes_in_batch` bytes. Each batch accumulates for no more than `max_batch_accumulation_milliseconds` milliseconds. Batches can be sent in parallel; the `max_export_requests_inflight` parameter limits the number of batches in flight at the same time. If more than `span_export_timeout_seconds` seconds have passed since the uploader received a span, the uploader may drop it in favor of fresher spans. |
| 79 | + |
| 80 | +Default values: |
| 81 | +* `max_exported_spans_per_second = inf` (no limits) |
| 82 | +* `max_spans_in_batch = 150` |
| 83 | +* `max_bytes_in_batch = 20000000` |
| 84 | +* `max_batch_accumulation_milliseconds = 1000` |
| 85 | +* `span_export_timeout_seconds = inf` (no spans are deleted by the uploader) |
| 86 | +* `max_export_requests_inflight = 1` |
| 87 | + |
| 88 | +The `uploader` section may be completely absent, in which case each parameter will use its default value. |
| 89 | + |
| 90 | +{% note info %} |
| 91 | + |
| 92 | +The uploader is a node-local component. Therefore, the described limits apply to each node separately, not to the entire cluster. |
| 93 | + |
| 94 | +{% endnote %} |
| 95 | + |
| 96 | +### External throttling {#external-throttling} |
| 97 | + |
| 98 | +#### Example section |
| 99 | + |
| 100 | +```yaml |
| 101 | +tracing_config: |
| 102 | + # ... |
| 103 | + external_throttling: |
| 104 | + - scope: |
| 105 | + database: /Root/db1 |
| 106 | + max_traces_per_minute: 60 |
| 107 | + max_traces_burst: 3 |
| 108 | +``` |
| 109 | + |
| 110 | +#### Description |
| 111 | + |
| 112 | +{{ ydb-short-name }} supports passing external trace-ids to build a coherent request trace. The method for passing an external trace-id is described on the [{#T}](./external-traces.md) page. To avoid overloading the collector, {{ ydb-short-name }} limits the number of externally traced requests. This section describes the limits as a sequence of rules. Each rule contains: |
| 113 | + |
| 114 | +* `scope` – a set of selectors for filtering the request. |
| 115 | +* `max_traces_per_minute` – the maximum average number of requests per minute traced by this rule. A positive integer is expected. |
| 116 | +* `max_traces_burst` – the maximum burst of externally traced requests. A non-negative integer is expected. |
| 117 | + |
| 118 | +The only mandatory parameter is `max_traces_per_minute`. |
| 119 | + |
| 120 | +A detailed description of these options is provided in the [{#T}](./setup.md#semantics) section. |
| 121 | + |
| 122 | +The `external_throttling` section is not mandatory; if it is absent, all trace-ids in requests are **ignored** (no external traces are continued). |
| 123 | + |
| 124 | +This section can be modified without restarting the node using the [dynamic configuration](../../../maintenance/manual/dynamic-config.md) mechanism. |
| 125 | + |
| 126 | +### Sampling |
| 127 | + |
| 128 | +#### Example section |
| 129 | + |
| 130 | +```yaml |
| 131 | +tracing_config: |
| 132 | + # ... |
| 133 | + sampling: |
| 134 | + - fraction: 0.01 |
| 135 | + level: 10 |
| 136 | + max_traces_per_minute: 5 |
| 137 | + max_traces_burst: 2 |
| 138 | + - scope: |
| 139 | + request_types: |
| 140 | + - KeyValue.ExecuteTransaction |
| 141 | + - KeyValue.Read |
| 142 | + fraction: 0.1 |
| 143 | + level: 15 |
| 144 | + max_traces_per_minute: 5 |
| 145 | + max_traces_burst: 2 |
| 146 | +``` |
| 147 | + |
| 148 | +#### Description |
| 149 | + |
| 150 | +For diagnosing system issues, looking at a sample request trace can be useful regardless of whether users trace their requests or not. For this purpose, {{ ydb-short-name }} has a request sampling mechanism. For a sampled request, a random trace-id is generated. This section controls request sampling in a format similar to [`external_throttling`](./setup.md#external-throttling). Each rule has two additional fields: |
| 151 | + |
| 152 | +* `fraction` – the fraction of requests sampled by this rule. A floating-point number between 0 and 1 is expected. |
| 153 | +* `level` – the detail level of the trace. An integer from 0 to 15 is expected. This parameter is described in more detail in the [{#T}](./setup.md#tracing-levels) section. |
| 154 | + |
| 155 | +Both fields are mandatory. |
| 156 | + |
| 157 | +The `sampling` section is not mandatory; no requests will be sampled if it is absent. |
| 158 | + |
| 159 | +This section can be modified without restarting the node using the [dynamic configuration](../../../maintenance/manual/dynamic-config.md) mechanism. |
| 160 | + |
| 161 | +## Rule semantics {#semantics} |
| 162 | + |
| 163 | +### Selectors |
| 164 | + |
| 165 | +Each rule includes an optional `scope` field with a set of selectors that determine which requests the rule applies to. Currently, the supported selectors are: |
| 166 | + |
| 167 | +* `request_types` |
| 168 | + |
| 169 | + Accepts a list of request types. A request matches this selector if its type is in the list. |
| 170 | + |
| 171 | + |
| 172 | +{% cut "Possible values" %} |
| 173 | + |
| 174 | +- KeyValue.CreateVolume |
| 175 | +- KeyValue.DropVolume |
| 176 | +- KeyValue.AlterVolume |
| 177 | +- KeyValue.DescribeVolume |
| 178 | +- KeyValue.ListLocalPartitions |
| 179 | +- KeyValue.AcquireLock |
| 180 | +- KeyValue.ExecuteTransaction |
| 181 | +- KeyValue.Read |
| 182 | +- KeyValue.ReadRange |
| 183 | +- KeyValue.ListRange |
| 184 | +- KeyValue.GetStorageChannelStatus |
| 185 | +- Table.CreateSession |
| 186 | +- Table.KeepAlive |
| 187 | +- Table.AlterTable |
| 188 | +- Table.CreateTable |
| 189 | +- Table.DropTable |
| 190 | +- Table.DescribeTable |
| 191 | +- Table.CopyTable |
| 192 | +- Table.CopyTables |
| 193 | +- Table.RenameTables |
| 194 | +- Table.ExplainDataQuery |
| 195 | +- Table.ExecuteSchemeQuery |
| 196 | +- Table.BeginTransaction |
| 197 | +- Table.DescribeTableOptions |
| 198 | +- Table.DeleteSession |
| 199 | +- Table.CommitTransaction |
| 200 | +- Table.RollbackTransaction |
| 201 | +- Table.PrepareDataQuery |
| 202 | +- Table.ExecuteDataQuery |
| 203 | +- Table.BulkUpsert |
| 204 | +- Table.StreamExecuteScanQuery |
| 205 | +- Table.StreamReadTable |
| 206 | +- Table.ReadRows |
| 207 | +- Query.ExecuteQuery |
| 208 | +- Query.ExecuteScript |
| 209 | +- Query.FetchScriptResults |
| 210 | +- Query.CreateSession |
| 211 | +- Query.DeleteSession |
| 212 | +- Query.AttachSession |
| 213 | +- Query.BeginTransaction |
| 214 | +- Query.CommitTransaction |
| 215 | +- Query.RollbackTransaction |
| 216 | +- Discovery.WhoAmI |
| 217 | +- Discovery.NodeRegistration |
| 218 | +- Discovery.ListEndpoints |
| 219 | + |
| 220 | +{% note info %} |
| 221 | + |
| 222 | +Tracing is not limited to the request types listed above. The list contains only the request types that the `request_types` selector supports. |
| 223 | + |
| 224 | +{% endnote %} |
| 225 | + |
| 226 | +{% note warning %} |
| 227 | + |
| 228 | +Note that the QueryService API is [experimental](https://github.com/ydb-platform/ydb/blob/e3af273efaef7dfa21205278f17cd164e247820d/ydb/public/api/grpc/ydb_query_v1.proto#L9) and may change in the future. |
| 229 | + |
| 230 | +{% endnote %} |
| 231 | + |
| 232 | +{% endcut %} |
| 233 | + |
| 234 | +* `database` |
| 235 | + |
| 236 | + Filters requests to the specified database. |
| 237 | + |
| 238 | +A request matches a rule if it matches all selectors. `scope` can be absent, which is equivalent to an empty set of selectors, and all requests will fall under this rule. |
| 239 | + |
| 240 | +### Rate limiting |
| 241 | + |
| 242 | +The `max_traces_per_minute` and `max_traces_burst` parameters limit the number of requests. In the case of sampling, they limit the number of requests sampled by this rule. In the case of external throttling, they limit the number of external traces that enter the system. |
| 243 | + |
| 244 | +A variation of the [leaky bucket](https://en.wikipedia.org/wiki/Leaky_bucket) is used for rate limiting with a bucket size equal to `max_traces_burst + 1`. For example, if `max_traces_per_minute = 60` and `max_traces_burst = 0`, then with a flow of 10,000 requests per minute, one request will be traced every second. If `max_traces_burst = 20`, then with the same request flow, the first 21 requests will be traced, and then one request per second will be traced. |
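
The numbers in this example can be reproduced with a small simulation. The sketch below is only an illustration of a bucket of size `max_traces_burst + 1` that refills at `max_traces_per_minute`; the `TraceLimiter` class and `simulate` function are hypothetical names and do not describe the actual {{ ydb-short-name }} implementation. Integer arithmetic is used so that the simulation is exact.

```python
MS_PER_MINUTE = 60_000


class TraceLimiter:
    """Hypothetical bucket limiter (not the actual YDB code)."""

    def __init__(self, max_traces_per_minute, max_traces_burst=0):
        # One trace costs MS_PER_MINUTE units; the bucket regains
        # max_traces_per_minute units per millisecond.
        self.refill_per_ms = max_traces_per_minute
        self.capacity = (max_traces_burst + 1) * MS_PER_MINUTE
        self.level = self.capacity  # assumption: the bucket starts full
        self.last_ms = 0

    def try_trace(self, now_ms):
        """Return True if the request arriving at now_ms may be traced."""
        elapsed = now_ms - self.last_ms
        self.last_ms = now_ms
        self.level = min(self.capacity, self.level + elapsed * self.refill_per_ms)
        if self.level >= MS_PER_MINUTE:
            self.level -= MS_PER_MINUTE
            return True
        return False


def simulate(limiter, requests=10_000, interval_ms=6):
    """Feed one minute of traffic at 10,000 requests per minute."""
    return [limiter.try_trace(i * interval_ms) for i in range(requests)]


no_burst = simulate(TraceLimiter(60, 0))
with_burst = simulate(TraceLimiter(60, 20))
print(sum(no_burst), sum(with_burst))
```

Under these assumptions, the first minute yields 60 traced requests with `max_traces_burst = 0` and 80 with `max_traces_burst = 20`: the first 21 requests pass immediately, and after that roughly one request per second is traced.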
| 245 | + |
| 246 | +{% note warning %} |
| 247 | + |
| 248 | +The limits on the number of traced requests are local to the cluster node. For example, if each cluster node has a rule specifying `max_traces_per_minute = 1`, then no more than one request per minute will be traced **from each cluster node** by this rule. |
| 249 | + |
| 250 | +{% endnote %} |
| 251 | + |
| 252 | +### Detail levels {#tracing-levels} |
| 253 | + |
| 254 | +As with [logs](../../../reference/embedded-ui/logs.md), diagnosing most system issues does not require the most detailed trace. Therefore, in {{ ydb-short-name }}, each span has its own level described by an integer from 0 to 15 inclusive. Each rule in the `sampling` section must include the detail level of the generated trace (`level`); spans with a level less than or equal to `level` will be included in it. |
| 255 | + |
| 256 | +The [{{ ydb-short-name }} architecture](../../../concepts/_includes/index/how_it_works.md#ydb-architecture) section describes the system's division into 5 layers: |
| 257 | + |
| 258 | +| Layer | Components | |
| 259 | +| ---- | --------- | |
| 260 | +| 1 | gRPC Proxies | |
| 261 | +| 2 | Query Processor | |
| 262 | +| 3 | Distributed Transactions | |
| 263 | +| 4 | Tablet, System tablet | |
| 264 | +| 5 | Distributed Storage | |
| 265 | + |
| 266 | +Each layer has six detail levels: |
| 267 | + |
| 268 | +| Level | Description | |
| 269 | +| ------- | -------- | |
| 270 | +| `Off` | No tracing | |
| 271 | +| `TopLevel` | Lowest detail, no more than two spans per request to the component | |
| 272 | +| `Basic` | Spans of main component operations | |
| 273 | +| `Detailed` | Highest detail applicable for diagnosing problems in production | |
| 274 | +| `Diagnostic` | Detailed debugging information for developers | |
| 275 | +| `Trace` | Very detailed debugging information | |
| 276 | + |
| 277 | +The table below shows which detail level each system layer uses at each trace detail level: |
| 278 | + |
| 279 | +| Trace detail level | gRPC Proxies | Query Processor | Distributed Transactions | Tablets | Distributed Storage | |
| 280 | +| ------------------------- | ------------ | --------------- | ------------------------ | ------- | ------------------- | |
| 281 | +| 0 | `TopLevel` | `Off` | `Off` | `Off` | `Off` | |
| 282 | +| 1 | `TopLevel` | **`TopLevel`** | `Off` | `Off` | `Off` | |
| 283 | +| 2 | `TopLevel` | `TopLevel` | **`TopLevel`** | `Off` | `Off` | |
| 284 | +| 3 | `TopLevel` | `TopLevel` | `TopLevel` | **`TopLevel`** | `Off` | |
| 285 | +| 4 | `TopLevel` | `TopLevel` | `TopLevel` | `TopLevel` | **`TopLevel`** | |
| 286 | +| 5 | **`Basic`** | `TopLevel` | `TopLevel` | `TopLevel` | `TopLevel` | |
| 287 | +| 6 | `Basic` | **`Basic`** | `TopLevel` | `TopLevel` | `TopLevel` | |
| 288 | +| 7 | `Basic` | `Basic` | **`Basic`** | `TopLevel` | `TopLevel` | |
| 289 | +| 8 | `Basic` | `Basic` | `Basic` | **`Basic`** | `TopLevel` | |
| 290 | +| 9 | `Basic` | `Basic` | `Basic` | `Basic` | **`Basic`** | |
| 291 | +| 10 | **`Detailed`** | **`Detailed`** | `Basic` | `Basic` | `Basic` | |
| 292 | +| 11 | `Detailed` | `Detailed` | **`Detailed`** | `Basic` | `Basic` | |
| 293 | +| 12 | `Detailed` | `Detailed` | `Detailed` | **`Detailed`** | `Basic` | |
| 294 | +| 13 | `Detailed` | `Detailed` | `Detailed` | `Detailed` | **`Detailed`** | |
| 295 | +| 14 | **`Diagnostic`** | **`Diagnostic`** | **`Diagnostic`** | **`Diagnostic`** | **`Diagnostic`** | |
| 296 | +| 15 | **`Trace`** | **`Trace`** | **`Trace`** | **`Trace`** | **`Trace`** | |
| 297 | + |
| 298 | +### Rules |
| 299 | + |
| 300 | +#### External throttling |
| 301 | + |
| 302 | +The semantics of each rule are as follows: it allocates a quota for the number of requests in this category. For example, if the `external_throttling` section looks like this: |
| 303 | + |
| 304 | +```yaml |
| 305 | +tracing_config: |
| 306 | + external_throttling: |
| 307 | + - max_traces_per_minute: 60 |
| 308 | + - scope: |
| 309 | + request_types: |
| 310 | + - KeyValue.ReadRange |
| 311 | + max_traces_per_minute: 20 |
| 312 | +``` |
| 313 | + |
| 314 | +With a sufficient flow of requests with an external trace-id, at least 60 requests per minute and at least 20 `KeyValue.ReadRange` type requests per minute will be traced. A total of up to 80 requests per minute will be traced. |
| 315 | + |
| 316 | +The algorithm is as follows: for a request with an external trace-id, the rules that apply to this request are determined. The request consumes the quota of all rules that still have it. The request is not traced only if none of the rules have any quota left. |
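
The algorithm above can be sketched in a few lines. This is a simplified illustration, not the actual implementation: the `matches`, `should_trace`, and `count_traced` helpers are hypothetical, and each rule's quota is modeled as a plain per-minute counter, ignoring refill over time and `max_traces_burst`.

```python
def matches(rule, request):
    """A request matches a rule if it satisfies every selector in its scope."""
    scope = rule.get("scope", {})
    if "database" in scope and request["database"] != scope["database"]:
        return False
    if "request_types" in scope and request["type"] not in scope["request_types"]:
        return False
    return True


def should_trace(rules, quotas, request):
    """Consume quota from every matching rule that has some; trace if any did."""
    traced = False
    for index, rule in enumerate(rules):
        if matches(rule, request) and quotas[index] > 0:
            quotas[index] -= 1
            traced = True
    return traced


# The two rules from the example configuration above.
rules = [
    {"max_traces_per_minute": 60},
    {"scope": {"request_types": ["KeyValue.ReadRange"]}, "max_traces_per_minute": 20},
]


def count_traced(requests):
    """Count traced requests within one minute's worth of quota."""
    quotas = [rule["max_traces_per_minute"] for rule in rules]
    return sum(should_trace(rules, quotas, request) for request in requests)


reads = [{"type": "KeyValue.ReadRange", "database": "/Root/db1"}] * 100
others = [{"type": "KeyValue.Read", "database": "/Root/db1"}] * 100

print(count_traced(others + reads))
print(count_traced(reads + others))
```

In this simplified model, the total depends on the order of arrival. When the other requests arrive first, 60 of them use up the general rule and 20 `KeyValue.ReadRange` requests are then traced by the second rule, for a total of 80. When the `KeyValue.ReadRange` requests arrive first, the first 20 consume quota from both rules, so only 60 requests are traced in total.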
| 317 | + |
| 318 | +#### Sampling |
| 319 | + |
| 320 | +The semantics of the rule for sampling are similar: with a sufficiently low flow of requests in this category, at least a `fraction` of the requests with at least `level` detail will be sampled. |
| 321 | + |
| 322 | +The algorithm is similar: for a request without an external trace-id (either because it had none initially or because of an earlier decision not to trace it), the set of rules that apply to the request is determined. The request consumes the quota of every rule that still has quota and has randomly "decided" to sample it. The request is not sampled if no rule with remaining quota decides to sample it. Otherwise, the trace detail level is the maximum `level` among the rules whose quota the request consumed. |
| 323 | + |
| 324 | +For example, with the following `sampling` configuration: |
| 325 | + |
| 326 | +```yaml |
| 327 | +tracing_config: |
| 328 | + sampling: |
| 329 | + - scope: |
| 330 | + database: /Root/db1 |
| 331 | + fraction: 0.5 |
| 332 | + level: 5 |
| 333 | + max_traces_per_minute: 100 |
| 334 | + - scope: |
| 335 | + database: /Root/db1 |
| 336 | + fraction: 0.01 |
| 337 | + level: 15 |
| 338 | + max_traces_per_minute: 5 |
| 339 | +``` |
| 340 | + |
| 341 | +With a sufficiently low flow of requests to the `/Root/db1` database, the following will be sampled: |
| 342 | + |
| 343 | +* 1% of requests with a detail level of 15 |
| 344 | +* 49.5% of requests with a detail level of 5 |
| 345 | + |
| 346 | +With a sufficiently high flow of requests to the `/Root/db1` database, the following will be sampled: |
| 347 | + |
| 348 | +* 5 requests per minute with a detail level of 15 |
| 349 | +* between 95 and 100 requests per minute with a detail level of 5 |
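
The low-flow percentages can be checked with a short calculation. The sketch below assumes that each rule makes its random decision independently of the others (an assumption of this example, not a statement about the implementation) and that no quota runs out. The `level_distribution` function is a hypothetical helper.

```python
from fractions import Fraction
from itertools import product

# (fraction, level) pairs for the two rules in the example above.
rules = [(Fraction(1, 2), 5), (Fraction(1, 100), 15)]


def level_distribution(rules):
    """Map each resulting trace level (None = not sampled) to its probability."""
    distribution = {}
    for decisions in product([False, True], repeat=len(rules)):
        probability = Fraction(1)
        level = None
        for (fraction, rule_level), decided in zip(rules, decisions):
            probability *= fraction if decided else 1 - fraction
            if decided:
                # The trace uses the maximum level among the deciding rules.
                level = rule_level if level is None else max(level, rule_level)
        distribution[level] = distribution.get(level, Fraction(0)) + probability
    return distribution


for level, probability in level_distribution(rules).items():
    print(level, float(probability))
```

Under these assumptions, level 15 gets a probability of 0.01, level 5 gets 0.5 × (1 − 0.01) = 0.495, and the remaining 0.495 of requests are not sampled. The same overlap explains the high-flow range: a request sampled at level 15 may also consume the quota of the first rule, so between 95 and 100 requests per minute remain at level 5.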