[exporter/elasticsearch] deprecate/remove dedup config #33773
This was referenced Jun 26, 2024
andrzej-stencel pushed a commit that referenced this issue Jun 28, 2024

**Description:** Deprecate `exporter/elasticsearch`'s `dedup` configuration. In a future release we will remove this configuration, and always deduplicate wherever necessary.

**Link to tracking Issue:** #33773

**Testing:** N/A

**Documentation:** Updated the README.
andrzej-stencel added a commit that referenced this issue Jul 16, 2024

**Description:** Remove the `dedup` configuration setting, and always de-duplicate. Elasticsearch does not permit duplicate keys in JSON objects, and this configuration is adding more complexity to the code than it's worth. I've simplified the `internal/objmodel` API slightly, unexporting the `Sort` methods, which are internally called by the now unconditional call to `Dedup`.

**Link to tracking Issue:** Closes #33773

**Testing:** Ran the unit tests, which cover deduplication. None of the tests in package elasticsearchexporter covered `dedup: false`.

**Documentation:** <Describe the documentation added.>

---------

Co-authored-by: Carson Ip <carsonip@users.noreply.github.com>
Co-authored-by: Andrzej Stencel <andrzej.stencel@elastic.co>
Component(s)
exporter/elasticsearch
Describe the issue you're reporting
The `dedup` config was introduced in the initial implementation of the Elasticsearch exporter for de-duplicating colliding attributes. This is particularly relevant when using the "Raw" or "ECS" encoding modes, where attributes are not nested in unique namespaces. For example, if one were to add a `trace.id` attribute to a log record, while also setting the top-level TraceID field, then in ECS mode this would lead to duplicate `trace.id` attributes being set in the Elasticsearch document; enabling the `dedup` configuration prevents this.

For better or worse, Elasticsearch rejects documents with objects that have duplicate keys: elastic/elasticsearch#19614. This is not configurable.
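
To make the collision concrete, here is a minimal sketch of key de-duplication in Go. It is not the exporter's `internal/objmodel` code: the `field` type and the `dedup` function are hypothetical names for illustration, and the "last value wins" policy is an assumption rather than a statement of what the exporter does.

```go
package main

import (
	"fmt"
	"sort"
)

// field is a hypothetical flattened key/value pair, standing in for
// whatever the exporter uses internally.
type field struct {
	key   string
	value string
}

// dedup returns the fields sorted by key with at most one entry per key.
// The sort is stable, so among equal keys the original order is kept and
// the last value written survives (an assumed policy for this sketch).
func dedup(fields []field) []field {
	sorted := make([]field, len(fields))
	copy(sorted, fields)
	sort.SliceStable(sorted, func(i, j int) bool {
		return sorted[i].key < sorted[j].key
	})

	out := make([]field, 0, len(sorted))
	for i, f := range sorted {
		// Skip this entry if the next one has the same key.
		if i+1 < len(sorted) && sorted[i+1].key == f.key {
			continue
		}
		out = append(out, f)
	}
	return out
}

func main() {
	// A log record carrying a trace.id attribute in addition to the
	// top-level TraceID, which ECS mode would also map to trace.id.
	fields := []field{
		{"trace.id", "from-attribute"},
		{"message", "hello"},
		{"trace.id", "from-top-level-field"},
	}
	for _, f := range dedup(fields) {
		fmt.Printf("%s=%s\n", f.key, f.value)
	}
}
```

Without a step like this, the serialized JSON object would contain `trace.id` twice, which is the kind of document Elasticsearch rejects.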
The only conceivable reason for making deduplication configurable in the exporter is when you know for absolutely sure that duplicates cannot occur, so you can save a few CPU cycles. This would be the exception rather than the norm, and I think the value does not justify the complexity the configuration introduces.
I propose we:
- deprecate the `dedup` config
- remove the `dedup` config

I don't think there's any harm in doing these out of order, essentially making the config a no-op.