Add _metric_names_hash field to OTel metric mappings #120952

Open
wants to merge 2 commits into base: main


felixbarny
Member

A short-term workaround for #99123

If metrics that have the same timestamp and dimensions aren't grouped into the same document, ES will consider them to be duplicates. The _metric_names_hash field will be set by the OTel ES exporter (see open-telemetry/opentelemetry-collector-contrib#37511). As it's mapped as a time_series_dimension, it creates a different _tsid for documents with different sets of metrics. The tradeoff is that if the composition of the metrics grouping changes over time, a different _tsid will be created, which has an impact on the rate aggregation for counters.
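
To make the mechanism concrete, here is a minimal sketch in Python. It is not the exporter's or Elasticsearch's actual code: `metric_names_hash`, `series_key`, and `index_docs` are hypothetical names, CRC32 is only an assumed stand-in for whatever hash the exporter uses, and the duplicate check is a toy model of "same _tsid and timestamp".

```python
import zlib


def metric_names_hash(metric_names):
    """Hypothetical helper: 8-hex-digit hash over the sorted metric names."""
    joined = ",".join(sorted(metric_names))
    # CRC32 is an assumption for illustration, not the exporter's real hash.
    return format(zlib.crc32(joined.encode("utf-8")), "08x")


def series_key(dimensions):
    """Stand-in for _tsid: a key built from every dimension field."""
    return tuple(sorted(dimensions.items()))


def index_docs(docs, with_hash):
    """Toy index: reject a doc whose (series key, timestamp) was already seen."""
    seen, accepted, rejected = set(), [], []
    for doc in docs:
        dims = dict(doc["dimensions"])
        if with_hash:
            # The hash becomes one more dimension, so it feeds the series key.
            dims["_metric_names_hash"] = metric_names_hash(doc["metrics"])
        key = (series_key(dims), doc["@timestamp"])
        if key in seen:
            rejected.append(doc)
        else:
            seen.add(key)
            accepted.append(doc)
    return accepted, rejected


# Same timestamp and dimensions, but different sets of metrics.
docs = [
    {"@timestamp": 1000, "dimensions": {"host.name": "a"},
     "metrics": {"system.cpu.time": 1.5}},
    {"@timestamp": 1000, "dimensions": {"host.name": "a"},
     "metrics": {"system.memory.usage": 42}},
]

_, dropped_without = index_docs(docs, with_hash=False)
_, dropped_with = index_docs(docs, with_hash=True)
print(len(dropped_without), len(dropped_with))
```

The same sketch also shows the tradeoff: if a metric is later added to or removed from a group, `metric_names_hash` returns a different value, the series key changes, and a counter's samples end up split across two series.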

felixbarny added the >bug, :Data Management/Data streams (Data streams and their lifecycles), auto-backport (Automatically create backport pull requests when merged), v9.0.0, v8.18.0, v8.17.2, and v8.16.4 labels on Jan 27, 2025
felixbarny requested a review from a team on January 27, 2025, 18:28
felixbarny requested a review from a team as a code owner on January 27, 2025, 18:28
elasticsearchmachine added the Team:Data Management (Meta label for data/management team) label on Jan 27, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

elasticsearchmachine added the external-contributor (Pull request authored by a developer outside the Elasticsearch team) label on Jan 27, 2025
@elasticsearchmachine
Collaborator

Hi @felixbarny, I've created a changelog YAML for you.

@@ -14,6 +14,10 @@ template:
 type: passthrough
 dynamic: true
 priority: 10
+# workaround for https://github.com/elastic/elasticsearch/issues/99123
+_metric_names_hash:
+type: keyword
Member

Q: would a number be more lightweight, since you're using an 8-digit hex value anyway?

Member Author

At the moment, numbers can't leverage run-length encoding. So it's actually lighter to use a keyword here, as all dimensions are incorporated into the _tsid, which we sort by. Therefore, all values for the same _tsid are equal and can be compressed very efficiently.
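
As a rough illustration of why sorting by _tsid helps, the sketch below run-length encodes a column of hash values. It only shows the general idea of run-length encoding; `run_length_encode` is a hypothetical helper, the hash values are made up, and this is not how Elasticsearch or Lucene store doc values internally.

```python
from itertools import groupby


def run_length_encode(values):
    """Collapse consecutive equal values into (value, run_length) pairs."""
    return [(value, sum(1 for _ in group)) for value, group in groupby(values)]


# Made-up hash values; sorted by series, equal values sit next to each other.
sorted_column = ["1a2b3c4d"] * 4 + ["9f8e7d6c"] * 3
# The same values interleaved, as if documents were not sorted by series.
interleaved_column = ["1a2b3c4d", "9f8e7d6c"] * 3 + ["1a2b3c4d"]

print(run_length_encode(sorted_column))
# → [('1a2b3c4d', 4), ('9f8e7d6c', 3)]
print(len(run_length_encode(interleaved_column)))
# → 7
```

With the sorted column, seven values collapse into two runs, while the interleaved column gains nothing from the encoding.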

Labels
auto-backport (Automatically create backport pull requests when merged), >bug, :Data Management/Data streams (Data streams and their lifecycles), external-contributor (Pull request authored by a developer outside the Elasticsearch team), Team:Data Management (Meta label for data/management team), v8.16.4, v8.17.2, v8.19.0, v9.1.0
3 participants