[exporter/elasticsearch] Use batch sender exporter helper for reliability #32377
Comments
Pinging code owners. See Adding Labels via Comments if you do not have permissions to add labels yourself.
This enhancement makes sense to me and sounds like a good idea. I'll defer to code owners in case there are component-specific thoughts here. Removing
Agreed, +1
This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping the code owners.
Pinging code owners. See Adding Labels via Comments if you do not have permissions to add labels yourself.
…batch sender (#34127)

**Description:** Refactor the Elasticsearch bulk indexer code to create an abstraction around the existing buffering, asynchronous bulk indexer. This is preparation for supporting two implementations of bulk indexing: the existing asynchronous one, and a new synchronous one that works well with exporterhelper's batch sender -- see #32632.

**Link to tracking Issue:** #32377

**Testing:** N/A, this is a non-functional change.

**Documentation:** N/A, pure refactoring.
**Description:** Add opt-in support for the experimental batch sender (open-telemetry/opentelemetry-collector#8122). When opting into this functionality, the exporter's `Consume*` methods will make synchronous bulk requests to Elasticsearch, without additional batching/buffering in the exporter. By default the exporter continues to use its own buffering, which supports thresholds for time, number of documents, and size of encoded documents in bytes. The batch sender does not currently support a bytes-based threshold and is experimental, which is why we are not yet making it the default for the Elasticsearch exporter. This PR is based on #32632, but made to be non-breaking.

**Link to tracking Issue:** #32377

**Testing:** Added unit and integration tests. Manually tested with Elasticsearch, using the following config to highlight that client metadata can now flow through all the way:

```yaml
receivers:
  otlp:
    protocols:
      http:
        endpoint: 0.0.0.0:4318
        include_metadata: true

exporters:
  elasticsearch:
    endpoint: "http://localhost:9200"
    auth:
      authenticator: headers_setter
    batcher:
      enabled: false

extensions:
  headers_setter:
    headers:
      - action: insert
        key: Authorization
        from_context: authorization

service:
  extensions: [headers_setter]
  pipelines:
    traces:
      receivers: [otlp]
      processors: []
      exporters: [elasticsearch]
```

I have Elasticsearch running locally, with an "admin" user with the password "changeme". Sending OTLP/HTTP to the collector with `telemetrygen traces --otlp-http --otlp-insecure http://localhost:4318 --otlp-header "Authorization=\"Basic YWRtaW46Y2hhbmdlbWU=\""`, I observe the following:

- Without the `batcher` config, the exporter fails to index data into Elasticsearch due to an auth error. That's because the exporter is buffering and dropping the context with client metadata, so there's no Authorization header attached to the outgoing requests.
- With `batcher: {enabled: true}`, same behaviour as above. Unlike the [`batch` processor](https://github.com/open-telemetry/opentelemetry-collector/tree/main/processor/batchprocessor), the batch sender does not maintain client metadata.
- With `batcher: {enabled: false}`, the exporter successfully indexes data into Elasticsearch.

**Documentation:** Updated the README.

Co-authored-by: Carson Ip <carsonip@users.noreply.github.com>
Closing as completed via #34238. Requires opt-in for the new behavior.
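For reference, opting in as merged looks roughly like the sketch below. This is illustrative, not authoritative: the threshold field names (`flush_timeout`, `min_size_items`, `max_size_items`) follow the experimental exporterbatcher configuration of that era and may change, since the feature is explicitly experimental.

```yaml
exporters:
  elasticsearch:
    endpoint: "http://localhost:9200"
    batcher:
      enabled: true         # opt in to the exporterhelper batch sender
      flush_timeout: 30s    # flush after this much time has elapsed
      min_size_items: 5000  # flush once this many items are buffered
      max_size_items: 10000 # cap on items per bulk request
```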
Component(s)
exporter/elasticsearch
Is your feature request related to a problem? Please describe.
This is a way to fix the issue reported in #30792.
If the collector crashes, events buffered in memory by the bulk indexer are lost, even with a persistent queue: items are removed from the queue as soon as they are handed to the exporter, before the bulk indexer has actually flushed them to Elasticsearch.
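For illustration, a configuration along these lines still loses whatever the bulk indexer has buffered at crash time, because items leave the persistent queue before the flush happens. This is a sketch assuming the exporter exposes the standard exporterhelper `sending_queue` settings; the `file_storage` directory and pipeline are assumptions, not taken from the issue:

```yaml
extensions:
  file_storage:
    directory: /var/lib/otelcol/storage

exporters:
  elasticsearch:
    endpoint: "http://localhost:9200"
    sending_queue:
      enabled: true
      storage: file_storage  # persists queued items across restarts,
                             # but not what the bulk indexer holds in memory

service:
  extensions: [file_storage]
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [elasticsearch]
```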
Describe the solution you'd like
Move the buffering from the bulk indexer to the exporterhelper batch sender introduced in open-telemetry/opentelemetry-collector#8122, so that events are deleted from the queue only after they have been flushed by the bulk indexer.
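In user-facing terms, the intended end state is roughly the following sketch (the `batcher` field matches what the later PRs shipped; treat the combination as illustrative). With the batch sender enabled, the exporter's `Consume*` call blocks until the bulk request completes, so the persistent queue drops an item only after it has actually been indexed:

```yaml
exporters:
  elasticsearch:
    endpoint: "http://localhost:9200"
    batcher:
      enabled: true          # batching handled by the exporterhelper batch sender
    sending_queue:
      enabled: true
      storage: file_storage  # items deleted only after a successful synchronous flush
```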
Describe alternatives you've considered
No response
Additional context
I'm already working on a draft and related PRs are on their way.
Blockers
- exporterhelper batch sender open-telemetry/opentelemetry-collector#8122 is a work in progress and its interface may change. Some features (e.g. batching by bytes) may not be available yet; see the sketch below for the exporter's native bytes-based flush settings.
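For comparison, the exporter's own buffering exposes the bytes-based threshold that the batch sender lacks. A sketch of that native config, with field names and defaults per the exporter README of the time (version-dependent):

```yaml
exporters:
  elasticsearch:
    endpoint: "http://localhost:9200"
    flush:
      bytes: 5242880  # flush when the encoded bulk request reaches ~5 MiB
      interval: 30s   # or every 30s, whichever comes first
```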