[exporter/loadbalancingexporter] amend README for data loss scenarios [chore] (open-telemetry#31271)

**Description:** amend the loadbalancingexporter README to explain how
topology changes can influence decisions around retry configuration, and
how they can result in data loss.


**Link to tracking Issue:** 31209

**Testing:** N/A

**Documentation:** README.md
alexchowle authored and XinRanZhAWS committed Mar 13, 2024
1 parent 0fea916 commit 5014d08
Showing 1 changed file with 6 additions and 0 deletions.
6 changes: 6 additions & 0 deletions exporter/loadbalancingexporter/README.md
@@ -49,6 +49,12 @@ This should be stable enough for most cases, and the larger the number of backen

This also supports service name based exporting for traces. If you have two or more collectors that collect traces and then use the spanmetrics processor to generate metrics and push them to Prometheus, there is a high chance of label collisions in Prometheus if the routing is based on `traceID`, because every collector sees the same `service+operation` label. With service name based routing, each service name is routed to a single collector, so collectors can push metrics without any label collisions.
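
As a rough sketch, service name based routing is selected through the `routing_key` setting. The backend hostnames below are placeholders chosen for illustration and are not taken from this README; adapt the resolver to your own environment.

```yaml
exporters:
  loadbalancing:
    # Route by service name instead of the default traceID-based routing.
    routing_key: "service"
    protocol:
      otlp:
        timeout: 1s
    resolver:
      static:
        # Hypothetical backends, for illustration only.
        hostnames:
          - backend-1:4317
          - backend-2:4317
```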

## Resilience and scaling considerations
The `loadbalancingexporter` will, irrespective of the chosen resolver (`static`, `dns`, `k8s`), create one exporter per endpoint. The exporter conforms to its published configuration regarding sending queue and retry mechanisms. Importantly, the `loadbalancingexporter` will not attempt to re-route data to a healthy endpoint on delivery failure, and data loss is therefore possible if the exporter's target remains unavailable once redelivery is exhausted. Due consideration needs to be given to the exporter queue and retry configuration when running in a highly elastic environment.

- When using the `static` resolver and a target is unavailable, all telemetry load-balanced to that target will fail to be delivered until the target is either restored or removed from the static list. The same principle applies to the `dns` resolver.
- When using `k8s`, `dns`, and likely future resolvers, topology changes are eventually reflected in the `loadbalancingexporter`. The `k8s` resolver will update more quickly than `dns`, but a window of time in which the true topology doesn't match the view of the `loadbalancingexporter` remains.
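
The considerations above could translate into a configuration along the following lines. This is a minimal sketch, assuming the standard OTLP exporter `retry_on_failure` and `sending_queue` options are set under `protocol.otlp`; the hostname and every value shown are illustrative assumptions, not recommendations.

```yaml
exporters:
  loadbalancing:
    protocol:
      otlp:
        timeout: 1s
        # Retry settings apply per endpoint; data is not re-routed to
        # another backend, so it is dropped once retries are exhausted.
        retry_on_failure:
          enabled: true
          initial_interval: 5s
          max_interval: 30s
          # Illustrative: long enough to outlast a short backend restart.
          max_elapsed_time: 300s
        # Buffers data for an endpoint while it is unavailable.
        sending_queue:
          enabled: true
          num_consumers: 10
          queue_size: 5000
    resolver:
      dns:
        # Hypothetical headless service name, for illustration only.
        hostname: otelcol-headless.observability.svc.cluster.local
        # How often the resolver re-checks DNS; a shorter interval narrows
        # the window in which the exporter's view of the topology is stale.
        interval: 5s
```

The trade-off in such a setup is that longer retry windows and larger queues reduce the chance of data loss when a backend disappears briefly, at the cost of more memory held per endpoint.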

## Configuration

Refer to [config.yaml](./testdata/config.yaml) for detailed examples on using the exporter.