unregister receiver metrics when receiver is stopped/removed #10223

Open
newly12 opened this issue May 27, 2024 · 3 comments
Labels
area:component, collector-telemetry (healthchecker and other telemetry collection issues)

Comments

@newly12 (Contributor) commented May 27, 2024

Is your feature request related to a problem? Please describe.

We have an in-house receiver that dynamically reloads its config to start/stop pipelines on the fly, without restarting the OTel Collector. We noticed that when a pipeline (receivers and processors) is stopped, the receiver metrics still remain on the metrics page, e.g. otelcol_receiver_accepted_metric_points and otelcol_receiver_refused_metric_points for a metrics receiver. As a result the metrics endpoint keeps growing in size, as does the total number of metrics. A minimal sketch of the behavior is included under "Additional context" below.

Describe the solution you'd like

When a receiver is stopped, its related metrics should be removed as well.

Describe alternatives you've considered

Additional context
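
A minimal sketch (not collector code) of why the series linger, assuming the OTel Go metrics SDK with its Prometheus exporter: the cumulative exporter keeps reporting every attribute set a counter has ever seen, and the SDK currently has no API to delete an instrument or drop an attribute set, so series recorded by a stopped receiver stay on /metrics. The instrument name and the "receiver" attribute below are illustrative.

```go
package main

import (
	"context"
	"log"
	"net/http"

	"github.com/prometheus/client_golang/prometheus/promhttp"
	"go.opentelemetry.io/otel/attribute"
	"go.opentelemetry.io/otel/exporters/prometheus"
	"go.opentelemetry.io/otel/metric"
	sdkmetric "go.opentelemetry.io/otel/sdk/metric"
)

func main() {
	// Prometheus exporter, registered with the default Prometheus registry.
	exp, err := prometheus.New()
	if err != nil {
		log.Fatal(err)
	}
	meter := sdkmetric.NewMeterProvider(sdkmetric.WithReader(exp)).Meter("demo")

	counter, err := meter.Int64Counter("receiver_accepted_metric_points")
	if err != nil {
		log.Fatal(err)
	}

	// Simulate a receiver instance that is later "stopped": the series keyed
	// by this attribute set keeps being exported even though nothing records
	// to it again, because the SDK offers no way to remove it.
	counter.Add(context.Background(), 10,
		metric.WithAttributes(attribute.String("receiver", "pipeline-a")))

	http.Handle("/metrics", promhttp.Handler())
	log.Fatal(http.ListenAndServe("localhost:9464", nil))
}
```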

@TylerHelmuth added the area:component and collector-telemetry (healthchecker and other telemetry collection issues) labels on May 29, 2024
@mx-psi added this to the Self observability milestone on May 29, 2024
@vjsamuel (Contributor) commented

This is exacerbated by https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/receiver/prometheusreceiver/metrics_receiver.go#L263, which expects receiver names to be unique; that causes the endpoint to grow without bound. The version of Prometheus that expects this behavior is 2.50+.

@newly12 (Contributor, Author) commented Jul 12, 2024

Hi team, could we get an update on this one? It has been a blocking issue for us in upgrading the version of our metrics OTel Collectors, and our logs OTel Collector hits the same problem. Since the pods on a given Kubernetes node change over time, receivers/pipelines have to be brought up and shut down constantly, which makes the metrics endpoint grow quickly and wastes memory on keeping "stale" metrics.

@newly12 (Contributor, Author) commented Jul 12, 2024

I found open-telemetry/opentelemetry-specification#3062; it appears the OTel metrics SDK does not support removal of individual metrics at the moment.

If automatic removal of metrics for stopped components cannot be supported in the short term, does it make sense to allow disabling metrics for certain component kinds (receiver/processor/exporter) through a new feature gate?
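
To illustrate the kind of suppression such an option could provide, here is a hedged sketch at the SDK level: a View that drops receiver instruments before they reach the exporter, so no per-receiver series is ever created. It assumes a recent OTel Go SDK (where AggregationDrop lives in sdk/metric); the "otelcol_receiver_*" wildcard is an assumption about the instrument naming, not an existing collector option.

```go
package main

import (
	"context"
	"log"

	"go.opentelemetry.io/otel/exporters/prometheus"
	sdkmetric "go.opentelemetry.io/otel/sdk/metric"
)

func main() {
	exp, err := prometheus.New()
	if err != nil {
		log.Fatal(err)
	}

	// Drop every instrument matching the (assumed) receiver name pattern so
	// the exporter never accumulates per-receiver series.
	provider := sdkmetric.NewMeterProvider(
		sdkmetric.WithReader(exp),
		sdkmetric.WithView(sdkmetric.NewView(
			sdkmetric.Instrument{Name: "otelcol_receiver_*"},
			sdkmetric.Stream{Aggregation: sdkmetric.AggregationDrop{}},
		)),
	)
	defer func() { _ = provider.Shutdown(context.Background()) }()
}
```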
