-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[processor/metricstransform] Aggregation skipped when label_set is empty but not nil #34430
Comments
Pinging code owners:
See Adding Labels via Comments if you do not have permissions to add labels yourself. |
I'm observing the same problem. I was wondering why the Honeycomb-provided config was reporting 640 distinct CPU events, and it turns out that all the aggregations are just being ignored. I resolved the problem by downgrading to Version 0.105.0 - now those 640 events are aggregated down to a single one. I'm guessing that #33655 introduced this bug, seeing as it replaced the event aggregation functions with a completely different code path. |
A potential hack: Latest observation if v0.106 is using a no_op_label instead of Looking at https://github.com/open-telemetry/opentelemetry-collector/blob/main/pdata/pcommon/slice.go#L125 if a given element in label_set does not exist in the original slice, |
…erTools#1369)" (GoogleContainerTools#1376) Due to a refactor in the upstream, the metric transform processor no longer takes `label_set:[]` as a hint to remove all labels, instead all labels fall through and get exported. See [issue](open-telemetry/opentelemetry-collector-contrib#34430) for details. This change is to reapply the upgrade to 0.106.0 to catch up with latest otelcollector but adding the following workaround: A label `no_op_label` is given to the `label_set` of the metrics that need all labels removed. In this case the [re-slice function in the core lib](https://github.com/open-telemetry/opentelemetry-collector/blob/main/pdata/pcommon/slice.go#L125) will filter out all other valid labels of the metric. As for the no_op_label nothing will be performed as it's not part of the original attributes. This reverts commit 1d33f46.
This change updates otelcoontribcol to latest and modifies to fit breaking changes from 0.104.0 and 0.106.0. Breaking change in 0.104.0 https://github.com/open-telemetry/opentelemetry-collector-contrib/releases/tag/v0.104.0 Breaking change in 0.106.0 open-telemetry/opentelemetry-collector-contrib#34430 - Localhost is now the default setting, while otel-agent and otel-collector require 0.0.0.0, so the feature gate has been removed. - The format of the environment variable was updated to meet the new syntax requirements. The otel-agent ConfigMap was split between the reconciler and controllers, ensuring that sync-related labels are only applied to reconcilers. - A `no_op_label` has been added to ensure that the aggregation in the metricstransform processor filters on all metric labels. This is a temporary workaround until a permanent fix is implemented upstream.
**Description:** <Describe what has changed.> It is a valid use case to aggregate against an empty label set, which will functionally clear all attributes. This behaviour was removed in #33655, which simplified the check to `len() == 0`, which covers the case of the label set being `nil` and having 0 elements as the same scenario. However, these are not the same scenario and have different meanings. This PR reintroduces the original behaviour, but in a more efficient way by recognizing a label set with 0 elements and clearing the attributes, which would be the logical conclusion after running the filter anyway. **Link to tracking Issue:** #34430 **Testing:** <Describe what testing was performed and which tests were added.> **Documentation:** <Describe the documentation added.> --------- Co-authored-by: Tyler Helmuth <12352919+TylerHelmuth@users.noreply.github.com>
This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping Pinging code owners:
See Adding Labels via Comments if you do not have permissions to add labels yourself. |
This is expected to be resolved as of #35006. |
This change updates otelcoontribcol to latest and modifies to fit breaking changes from 0.104.0 and 0.106.0. Breaking change in 0.104.0 https://github.com/open-telemetry/opentelemetry-collector-contrib/releases/tag/v0.104.0 Breaking change in 0.106.0 open-telemetry/opentelemetry-collector-contrib#34430 - Localhost is now the default setting, while otel-agent and otel-collector require 0.0.0.0, so the feature gate has been removed. - The format of the environment variable was updated to meet the new syntax requirements. The otel-agent ConfigMap was split between the reconciler and controllers, ensuring that sync-related labels are only applied to reconcilers. - A `no_op_label` has been added to ensure that the aggregation in the metricstransform processor filters on all metric labels. This is a temporary workaround until a permanent fix is implemented upstream.
This change updates otelcoontribcol to latest and modifies to fit breaking changes from 0.104.0 and 0.106.0. Breaking change in 0.104.0 https://github.com/open-telemetry/opentelemetry-collector-contrib/releases/tag/v0.104.0 Breaking change in 0.106.0 open-telemetry/opentelemetry-collector-contrib#34430 - Localhost is now the default setting, while otel-agent and otel-collector require 0.0.0.0, so the feature gate has been removed. - The format of the environment variable was updated to meet the new syntax requirements. The otel-agent ConfigMap was split between the reconciler and controllers, ensuring that sync-related labels are only applied to reconcilers. - A `no_op_label` has been added to ensure that the aggregation in the metricstransform processor filters on all metric labels. This is a temporary workaround until a permanent fix is implemented upstream.
…metry#35006) **Description:** <Describe what has changed.> It is a valid use case to aggregate against an empty label set, which will functionally clear all attributes. This behaviour was removed in open-telemetry#33655, which simplified the check to `len() == 0`, which covers the case of the label set being `nil` and having 0 elements as the same scenario. However, these are not the same scenario and have different meanings. This PR reintroduces the original behaviour, but in a more efficient way by recognizing a label set with 0 elements and clearing the attributes, which would be the logical conclusion after running the filter anyway. **Link to tracking Issue:** open-telemetry#34430 **Testing:** <Describe what testing was performed and which tests were added.> **Documentation:** <Describe the documentation added.> --------- Co-authored-by: Tyler Helmuth <12352919+TylerHelmuth@users.noreply.github.com>
Component(s)
processor/metricstransform
What happened?
Description
We are upgrading from 0.102.0 to 0.106.0 for CVE fixes and encountered a behavior change in the mentioned processor. With the configuration we have the expectation is no label remains for the metric after aggregation operation, but now all labels fall through and get exported.
When looking at the code previous filter skips if label_set is nil and refactored code skips if length is 0.
I did not find the change of this behavior in the release note, was it intentional? If yes, would omitting this field work? I've tried explicitly setting label_set to nil and the filter is still not working as expected.
Steps to Reproduce
Install otelconribcol 0.102.0 with our ConfigMap, everything works fine;
Upgrade to 0.106.0, the metric that has
labe_set:[]
starts to getting rejected by our back end as unrecognized labels received.Expected Result
With
label_set:[]
all labels associated with a metric are filtered out.Actual Result
With
label_set:[]
all labels associated with a metric remain.Collector version
0.106.0
Environment information
Environment
GKE cluster 1.29.6-gke.1254000
OpenTelemetry Collector configuration
Log output
No response
Additional context
No response
The text was updated successfully, but these errors were encountered: