OpenTelemetry Collector memory leak / non-optimized GC #26087
Comments
Pinging code owners. See Adding Labels via Comments if you do not have permissions to add labels yourself.
Can you check the metrics emitted by the collector? Perhaps there are more items in the queue during specific events?
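For reference, the collector exposes its own metrics (including the exporter queue gauges otelcol_exporter_queue_size and otelcol_exporter_queue_capacity) on its internal telemetry endpoint, which can be watched around the events in question. A minimal sketch, assuming the default Prometheus-style endpoint on port 8888:

```yaml
# Sketch: expose the collector's internal metrics so the exporter
# queue can be observed during specific events (e.g. re-renders).
service:
  telemetry:
    metrics:
      level: detailed        # fuller internal metric set
      address: 0.0.0.0:8888  # scrape endpoint for the collector's own metrics
```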
One question was raised internally:
Probably related to open-telemetry/opentelemetry-collector#5966
The collector does not automatically update its config, although the load balancing exporter will update its list of backends if the DNS or Kubernetes resolvers are used.
And AFAIK, updating config via SIGHUP is not on the roadmap.
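As a sketch of that alternative: with the DNS resolver, the exporter re-resolves the backend list on its own, so the config file (and hence the Nomad template) never has to change when backends come and go. The hostname below is a placeholder:

```yaml
exporters:
  loadbalancing:
    protocol:
      otlp:
        tls:
          insecure: true
    resolver:
      dns:
        # Placeholder: a DNS name that returns one A record per backend.
        hostname: otelcol-backends.example.internal
        port: 4317
```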
This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments if you do not have permissions to add labels yourself.
This issue has been closed as inactive because it has been stale for 120 days with no activity. |
Component(s)
exporter/loadbalancing
What happened?
Description
We observe a memory leak in the OpenTelemetry Collector load balancing layer, which triggers OOM kills and restarts of the Nomad jobs.
Memory metrics for the high-load region:
pprof top 10 results:
--inuse_space:
--inuse_objects:
Memory metrics for the low-load region:
So we are not sure whether the Go garbage collector is not keeping up or there is an actual memory leak.
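For anyone reproducing the profiling above: one way to capture such heap profiles is the collector's pprof extension, a sketch of which follows (port 1777 is its default). The top-10 listings can then be pulled with `go tool pprof -top -inuse_space http://localhost:1777/debug/pprof/heap` (and likewise with `-inuse_objects`).

```yaml
# Sketch: enable the pprof extension so heap profiles can be
# scraped from the running collector.
extensions:
  pprof:
    endpoint: localhost:1777

service:
  extensions: [pprof]
```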
Steps to Reproduce
Expected Result
Actual Result
Collector version
0.83.0
Environment information
Environment
nomad + docker
OpenTelemetry Collector configuration
Log output
No response
Additional context
We are using Nomad's signal change_mode to re-render the template when the backend otel collector list is updated. Every time a re-render is signaled, there is a memory jump.
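As a possible stopgap while the root cause is investigated, the memory_limiter processor can cap the collector's heap below the container limit; above its soft limit it refuses incoming data and triggers garbage collection, which may also help distinguish a lazy GC from a true leak. A sketch with illustrative limits (size them to the Nomad task's memory allocation):

```yaml
processors:
  # Soft-limits the collector's memory; above the limit it refuses
  # incoming data and forces garbage collection.
  memory_limiter:
    check_interval: 1s
    limit_mib: 1600       # illustrative: set below the task's memory limit
    spike_limit_mib: 400  # headroom for short bursts
```

Note that memory_limiter should be placed first in each pipeline's processor list.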