[bug] Memory leak with long-running bidirectional streams #2668

@Falco20019

Component

OpenTelemetry.Instrumentation.GrpcCore

Package Version

| Package Name | Version |
| --- | --- |
| OpenTelemetry.Api | 1.9.0 |
| OpenTelemetry | 1.9.0 |
| OpenTelemetry.Exporter.OpenTelemetryProtocol | 1.9.0 |
| OpenTelemetry.Extensions.Hosting | 1.9.0 |
| OpenTelemetry.Instrumentation.GrpcCore | 1.0.0-beta.6 |

Runtime Version

net8.0

Description

The ServerTracingInterceptor+ServerRpcScope<TRequest, TResponse> holds a reference to an Activity. For long-running calls, such as a bidirectional stream for ADS in xDS, this means that DiagNode<T> elements accumulate and are kept in memory for the whole lifetime of the application, per gRPC channel. In our case, memory usage grew to 5 GB of RAM before we had to restart the application.
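
To illustrate the growth pattern, here is a minimal sketch that does not use the GrpcCore instrumentation at all. It assumes that the retained DiagNode<T> elements come from per-message data (here: events) attached to the one Activity that lives as long as the stream. The class and method names are hypothetical.

```csharp
using System;
using System.Diagnostics;
using System.Linq;

public static class LongLivedActivityDemo
{
    // Hypothetical stand-in for a stream handler: one Activity spans the
    // whole stream, and every message appends an event to it.
    public static int CountEventsAfter(int messageCount)
    {
        using var streamActivity = new Activity("BidiStream");
        streamActivity.Start();

        for (int i = 0; i < messageCount; i++)
        {
            // Each event is stored on the Activity itself and stays
            // reachable for as long as the Activity is referenced.
            streamActivity.AddEvent(new ActivityEvent("message"));
        }

        // The number of retained events equals the number of messages.
        return streamActivity.Events.Count();
    }
}
```

As long as the stream stays open and something references the Activity, none of this per-message data can be collected, so memory grows with the number of messages rather than with the number of open calls.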

For long-running activities, Microsoft advises using short-lived sub-activities instead and calling subActivity.SetParentId(parentActivity.TraceId, parentActivity.SpanId); to link them to the parent (this avoids multi-threading possibly leading to faulty parents). We had to apply the same fix to some of our libraries for the same reason.
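
The sub-activity approach could look roughly like the sketch below. It is not the instrumentation's code and not a proposed patch. PerMessageActivities, StartMessageActivity, and HandleMessages are hypothetical names, and the sketch assumes the default W3C ID format so that TraceId and SpanId are populated.

```csharp
using System;
using System.Diagnostics;

public static class PerMessageActivities
{
    // Hypothetical helper: starts a short-lived activity for one message and
    // links it to the long-running stream activity by trace ID and span ID.
    public static Activity StartMessageActivity(Activity streamActivity, string operationName)
    {
        var messageActivity = new Activity(operationName);

        // Set the parent explicitly instead of relying on Activity.Current,
        // which may point at an unrelated activity on another thread.
        messageActivity.SetParentId(
            streamActivity.TraceId,
            streamActivity.SpanId,
            streamActivity.ActivityTraceFlags);

        return messageActivity.Start();
    }

    // Hypothetical message loop: per-message data goes onto the short-lived
    // activity, so nothing accumulates on the stream activity.
    public static void HandleMessages(Activity streamActivity, int messageCount)
    {
        for (int i = 0; i < messageCount; i++)
        {
            using Activity messageActivity = StartMessageActivity(streamActivity, "ProcessMessage");
            messageActivity.AddEvent(new ActivityEvent("message received"));
            // Disposing stops the activity at the end of each iteration.
        }
    }
}
```

Because each sub-activity is stopped after its message, its events and tags can be collected once nothing else references it, while the trace still shows every message under the stream's span.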

Steps to Reproduce

  1. Create a long-running connection
  2. Spam thousands of gRPC calls
  3. Check the RAM usage

Expected Result

No memory leak, and activities are not held indefinitely.

Actual Result

Memory leak, because RpcScope keeps a single scope for the whole call's lifecycle without creating child activities.

Additional Context

Unfortunately, we are bound to OpenTelemetry 1.9.0, since 1.10.0 raised the minimum versions of Microsoft.Extensions.Diagnostics.Abstractions and Microsoft.Extensions.Logging.Configuration to 9.x. Under semantic versioning this would have been a breaking change and should have been reflected by releasing it as 2.0.0 instead. Since we follow semver2 and can't break our upstream users, we have to stick with 8.x on net8.0 and only started using the newer versions on net9.0.

See open-telemetry/opentelemetry-dotnet#5967 (comment)

Labels: bug, comp:instrumentation.grpccore