Add _metric_names_hash field to OTel metric mappings #120952
base: main
If metrics that have the same timestamp and dimensions aren't grouped into the same document, ES will consider them duplicates. The _metric_names_hash field will be set by the OTel ES exporter. Because it's mapped as a time_series_dimension, documents with different sets of metrics get a different _tsid. The tradeoff is that if the composition of a metrics grouping changes over time, a new _tsid is created, which affects the rate aggregation for counters.
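To make the mechanism concrete, here is a minimal sketch of how a hash over a document's metric names could be derived. It is an illustration only: the function names, the FNV-1a hash, and the NUL separator are assumptions, not necessarily what the OTel ES exporter does. The only detail taken from this thread is that the value is an 8-digit hex string.

```python
def fnv1a_32(data: bytes) -> int:
    """32-bit FNV-1a hash (assumed here for illustration)."""
    h = 0x811C9DC5  # FNV offset basis
    for byte in data:
        h ^= byte
        h = (h * 0x01000193) & 0xFFFFFFFF  # multiply by FNV prime, keep 32 bits
    return h


def metric_names_hash(metric_names) -> str:
    """Hypothetical helper: hash the *set* of metric names in one document."""
    # Sort and de-duplicate so the result does not depend on input order.
    joined = "\x00".join(sorted(set(metric_names)))
    # Format as 8 lowercase hex digits, zero-padded.
    return f"{fnv1a_32(joined.encode('utf-8')):08x}"


# Two documents with the same timestamp and dimensions but different metrics
# get different hash values, and therefore a different _tsid.
doc_a = metric_names_hash(["system.cpu.time", "system.cpu.utilization"])
doc_b = metric_names_hash(["system.memory.usage"])
```

The sketch also shows the tradeoff: adding or removing one metric name from a grouping changes the hash, so the same counter would continue under a new _tsid.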
Pinging @elastic/es-data-management (Team:Data Management)
Hi @felixbarny, I've created a changelog YAML for you.
@@ -14,6 +14,10 @@ template:
 type: passthrough
 dynamic: true
 priority: 10
+# workaround for https://github.com/elastic/elasticsearch/issues/99123
+_metric_names_hash:
+type: keyword
Q: Would a number be more lightweight, given that you're using an 8-digit hex value anyway?
At the moment, numbers can't leverage run-length encoding, so it's actually lighter to use a keyword here. All dimensions are incorporated into the _tsid, which we sort by. Therefore, all values for the same _tsid are equal and can be compressed very efficiently.
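A toy illustration of why sorting by _tsid helps: once equal values sit next to each other, run-length encoding collapses each run into a single (value, count) pair. This is a generic sketch with made-up hash values, not the encoding Elasticsearch or Lucene actually uses.

```python
from itertools import groupby


def run_length_encode(values):
    """Collapse consecutive equal values into (value, run_length) pairs."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(values)]


# Hypothetical hash values. Sorted by _tsid, equal values are adjacent.
sorted_by_tsid = ["a1b2c3d4"] * 4 + ["e5f6a7b8"] * 3

# The same seven values without that ordering.
interleaved = [
    "a1b2c3d4", "e5f6a7b8", "a1b2c3d4", "e5f6a7b8",
    "a1b2c3d4", "e5f6a7b8", "a1b2c3d4",
]

sorted_runs = run_length_encode(sorted_by_tsid)    # 2 runs
interleaved_runs = run_length_encode(interleaved)  # 7 runs, no savings
```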
A short-term workaround for #99123. The exporter-side change that sets _metric_names_hash is open-telemetry/opentelemetry-collector-contrib#37511.