OTLPHTTP exporter ignores response code and content-type, retries and logs anyway #8263

Closed
breedx-splk opened this issue Aug 23, 2023 · 3 comments · Fixed by #8270
Labels
bug Something isn't working

Comments

@breedx-splk

There is a regression in the otlphttp exporter in versions 0.81.0, 0.82.0, and 0.83.0
of otel-collector-contrib. It first shows up when debug logging for trace export
is enabled:

{"kind": "exporter", "data_type": "traces", "name": "logging/debug"}
2023-08-22T15:36:59.784-0700	info	exporterhelper/queued_retry.go:423	Exporting failed. Will retry the request after interval.	{"kind": "exporter", "data_type": "traces", "name": "otlphttp", "error": "unexpected EOF", "interval": "7.214643193s"}

The collector has sent a successful otlphttp payload to a backend and received an HTTP 200
success response, but fails to parse the return body and thus unnecessarily re-queues
the data for retransmission.

The result is both a confusing error message to the user and a duplicate sending
of payload data to the backend. This wastes resources and could potentially cause
other larger problems for backends.

The root cause appears to have been introduced with the partial success
handling added in 0.81.0. That code assumes the response body from the backend
is always protobuf, which is not necessarily true. In this repro case,
the response is clearly JSON.

Steps to reproduce

See the details in this repro repo: https://github.com/breedx-splk/collector_otlphttp_eof

What did you expect to see?

Successful exports should not log an error and definitely should not retry, regardless of the content-type of the response.
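
The expected behavior could be sketched roughly as follows: decide the outcome from the HTTP status code first, so that any 2xx response counts as success no matter what the body contains. The type and function names are hypothetical, and the set of retryable codes (429, 502, 503, 504) is taken from my reading of the OTLP/HTTP specification rather than from the collector source.

```go
package main

import "net/http"

// exportOutcome is a hypothetical classification of an export attempt.
type exportOutcome int

const (
	outcomeSuccess   exportOutcome = iota // data accepted; never retry
	outcomeRetry                          // transient failure; re-queue
	outcomePermanent                      // non-retryable failure; drop
)

// classifyStatus maps an HTTP status code to an outcome without looking
// at the response body or its Content-Type.
func classifyStatus(status int) exportOutcome {
	switch {
	case status >= 200 && status <= 299:
		return outcomeSuccess
	case status == http.StatusTooManyRequests,
		status == http.StatusBadGateway,
		status == http.StatusServiceUnavailable,
		status == http.StatusGatewayTimeout:
		return outcomeRetry
	default:
		return outcomePermanent
	}
}
```

Under this scheme, a failure to parse the body of a 200 response could at most produce a log line; it would not put the data back on the retry queue.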

What did you see instead?

An error is logged and the collector retries, even though the data was exported correctly.

What version did you use?

Verified as problematic on 0.81.0, 0.82.0, and 0.83.0

What config did you use?

See above repro repo for config.

Environment

Tested on macOS.

Additional context

https://github.com/breedx-splk/collector_otlphttp_eof

@breedx-splk breedx-splk added the bug Something isn't working label Aug 23, 2023
@breedx-splk breedx-splk changed the title from "OTLPHTTP exporter" to "OTLPHTTP exporter ignores response code and content-type, retries and logs" Aug 23, 2023
@breedx-splk breedx-splk changed the title from "OTLPHTTP exporter ignores response code and content-type, retries and logs" to "OTLPHTTP exporter ignores response code and content-type, retries and logs anyway" Aug 23, 2023
@breedx-splk
Author

PR #6970 seems to ignore the response Content-Type and simply assumes a protobuf response regardless. When this happens, other responses (text, JSON, or anything else) cannot be parsed, so the collector logs an error (visible at debug level, at least) and queues the data for retry.

dmitryax pushed a commit that referenced this issue Aug 24, 2023
…8270)

**Description:**
Fix the handling of the HTTP response to ignore responses not encoded as
protobuf

**Link to tracking Issue:**
Fixes #8263
@tigrannajaryan
Member

@atoulme is it possible to also improve the error message when the response is really not parseable?

@atoulme
Contributor

atoulme commented Aug 24, 2023

> @atoulme is it possible to also improve the error message when the response is really not parseable?

Please see https://github.com/open-telemetry/opentelemetry-collector/pull/8283/files thanks!
