
Significant Memory Increase with OpenTelemetry Leading to OoMKilled Issues on Kubernetes #5461

Open
phthaocse opened this issue Jun 1, 2024 · 6 comments
Labels
bug (Something isn't working) · invalid (This doesn't seem right) · question (Further information is requested) · response needed (Waiting on user input before progress can be made)

Comments

@phthaocse

Hello,

Our company is currently using the latest version of OpenTelemetry Go 1.27.0. After implementing OpenTelemetry to record metrics, we noticed a significant increase in memory usage in our pods deployed on Kubernetes, leading to OoMKilled issues. Could you please provide us with any documentation or knowledge regarding how OpenTelemetry manages memory?

Thank you.
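For reference, a minimal sketch of the kind of metrics setup described above, assuming the OTLP/gRPC exporter and a periodic reader; the meter and instrument names are made up for illustration, not the reporter's actual code:

package main

import (
	"context"
	"log"
	"time"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetricgrpc"
	sdkmetric "go.opentelemetry.io/otel/sdk/metric"
)

func main() {
	ctx := context.Background()

	// OTLP/gRPC exporter; the endpoint is taken from OTEL_EXPORTER_OTLP_ENDPOINT by default.
	exp, err := otlpmetricgrpc.New(ctx)
	if err != nil {
		log.Fatal(err)
	}

	// A periodic reader exports accumulated metrics on an interval.
	provider := sdkmetric.NewMeterProvider(
		sdkmetric.WithReader(sdkmetric.NewPeriodicReader(exp, sdkmetric.WithInterval(30*time.Second))),
	)
	defer func() { _ = provider.Shutdown(ctx) }()
	otel.SetMeterProvider(provider)

	// Hypothetical instrument: a counter incremented per request.
	meter := otel.Meter("example/service")
	requests, err := meter.Int64Counter("requests.total")
	if err != nil {
		log.Fatal(err)
	}
	requests.Add(ctx, 1)
}

With a fixed set of instruments and attribute combinations, SDK memory should level off after a while; recording with unbounded attribute values (for example, per-request IDs) keeps one aggregation alive per combination and is a common source of growth.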

phthaocse added the bug label Jun 1, 2024
@dmathieu
Member

This ask is rather vague.
OpenTelemetry does not "manage memory" per se. Go manages memory.

We do have benchmarks that track allocations though.
They run on new releases, and manually on an as-needed basis in PRs.

Investigating this would require looking into what exactly is using memory within your application.
That may be due to otel (like anything, it has a memory and CPU footprint). It could also be that you were stretched too thin in terms of resources.
Without more information, I'm afraid there isn't much more we can do here.
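One low-effort way to see what exactly is using memory is Go's built-in pprof HTTP endpoint; a minimal sketch follows (the listen address is an assumption, and the endpoint should only be reachable inside the pod or cluster):

package main

import (
	"log"
	"net/http"
	_ "net/http/pprof" // registers the /debug/pprof/* handlers on http.DefaultServeMux
)

func main() {
	// Debug-only listener; keep it internal to the pod/cluster.
	log.Println(http.ListenAndServe("localhost:6060", nil))
}

The heap profile at /debug/pprof/heap attributes live memory to call sites, which is the information needed to tell whether growth comes from the otel SDK, the exporter, or elsewhere in the application.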

@pellared
Member

Could you please provide us with any documentation or knowledge regarding how OpenTelemetry manages memory?

I think it would be overkill. You can always read the codebase.

After implementing OpenTelemetry to record metrics, we noticed a significant increase in memory usage in our pods deployed on Kubernetes, leading to OoMKilled issues.

We cannot do anything without repro steps or profiling data.
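For anyone wanting to attach such data, here is a short sketch of writing a heap profile to a file with runtime/pprof (the helper name and output path are arbitrary); the resulting file can be opened with go tool pprof and shared on the issue:

package main

import (
	"log"
	"os"
	"runtime"
	"runtime/pprof"
)

// writeHeapProfile is a hypothetical helper: it forces a GC so the profile
// reflects live objects rather than not-yet-collected garbage, then dumps
// the heap profile to the given file.
func writeHeapProfile(path string) error {
	f, err := os.Create(path)
	if err != nil {
		return err
	}
	defer f.Close()
	runtime.GC()
	return pprof.WriteHeapProfile(f)
}

func main() {
	if err := writeHeapProfile("heap.pprof"); err != nil {
		log.Fatal(err)
	}
}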

pellared added the invalid and question labels Jun 18, 2024
MrAlias added the response needed label Jun 18, 2024
@yaniv-s

yaniv-s commented Jun 25, 2024

There's definitely a problem with memory allocations/usage in 1.27. Since I upgraded from 1.24 to 1.27 my service uses more memory. This is from pprof, I hope it can help:
[pprof heap graph screenshot]

@MrAlias
Contributor

MrAlias commented Jun 25, 2024

Please provide the example code that you used to generate that graphic. I am not aware of a function in this project called AddTagToContext. It looks like an inlined grow is happening there. Understanding that call site is needed to begin addressing this.

@kellis5137

Has anyone found a solution for this issue?

@kellis5137

kellis5137 commented Aug 2, 2024

Just in case someone runs into this problem: I'm not 100% sure of the exact cause, but the sidecar's memory resource limit is 32Mi, and I think it needs to be bumped. I upped it and it worked. It took me a while to figure out HOW to bump the autoinstrumentation go sidecar. In your Instrumentation manifest, add a go section under the spec object:

apiVersion: opentelemetry.io/v1alpha1
kind: Instrumentation
metadata:
  name: my-instrumentation
spec:
  exporter:
    endpoint: http://otel-collector:4317
  propagators:
    - tracecontext
    - baggage
    - b3
  sampler:
    type: parentbased_traceidratio
    argument: "0.25"
  go:
    resourceRequirements:
      limits:
        cpu: <up the value if necessary>
        memory: <up the value if necessary> # I upped it to 512Mi (normally 32Mi). Going to monitor and see if I can go down
      requests:
        cpu: 5m # this is the original value as of this writing
        memory: 62Mi # I doubled the amount for the default (normally 32Mi)
    env:
      - name: OTEL_EXPORTER_OTLP_ENDPOINT
        value: http://otel-collector:4318
