-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add throttling / backpressure and correct error handling to OTLP receiver #669
Comments
@pjanotti can you please add details here about how exactly receivers are informed about backpressure? |
This is a nice-to-have, removing from 1.0. |
We also need to test this for OTLP receiver's HTTP transport and ensure we return the correct backpressure code according to OTLP/HTTP spec. Preferably these all should have automated tests. |
Is this specific to OTLP, or does it apply to other push-based receivers too? |
Briefly checked the source code and what we need to do is check that the error returned from the consumer is PermanentError (which can happen if connected directly to exporter) and return a BadData to the sender. In all other cases we should return a "backoff" response to the sender (real-life use cases are either errors from memory limiter or errors from queued_retry when the queue is fill). If the error is throttleRetry error we can also extract and use the retry interval from throttleRetry. All others receivers need to behave similarly. Not sure if we can implement some sort of helper (likely doable at least for all http-based receivers or for all grpc-based receivers). |
…try#669) * Reimplement histogram using mutex instead of stateLocker Move existing implementation to histogram_statelocker.go. Implement benchmarks for single thread and parallel histogram updates comparing mutex version to stateLocker version * Drop statelocker implementation and alignment tests, benchmarks Co-authored-by: Joshua MacDonald <jmacd@users.noreply.github.com>
Fixes open-telemetry#657 With the changes in open-telemetry#667 and open-telemetry#669 to use a plain-old-mutex for concurrent access of Histogram and MinMaxSumCount aggregators, the StateLocker implementation is no longer used in the project. Co-authored-by: Joshua MacDonald <jmacd@users.noreply.github.com>
Related or duplicate of #4335 ? |
OTLP must throttle the clients when it receives a backpressure from the pipeline (e.g. memory_limiter).
Must comply with OTLP spec and use gRPC status codes + RetryInfo (see the spec).
Also currently does not seem to correctly handle non-permanent errors returned by next consumer. We return 500, which according to OTLP spec means client will not retry. This means we will lose data.
Instead for retryable errors we must return 503 and for permanent errors must return 400 or similar.
The text was updated successfully, but these errors were encountered: