Description
Component
OpenTelemetry.Instrumentation.GrpcCore
Package Version
| Package Name | Version |
|---|---|
| OpenTelemetry.Api | 1.9.0 |
| OpenTelemetry | 1.9.0 |
| OpenTelemetry.Exporter.OpenTelemetryProtocol | 1.9.0 |
| OpenTelemetry.Extensions.Hosting | 1.9.0 |
| OpenTelemetry.Instrumentation.GrpcCore | 1.0.0-beta.6 |
Runtime Version
net8.0
Description
The ServerTracingInterceptor+ServerRpcScope<TRequest, TResponse> holds a reference to an Activity. For long-running calls, such as a bi-directional stream for ADS on xDS, this means that DiagNode<T> elements accumulate and are kept in memory for the WHOLE lifetime of the application, per gRPC channel. In our case, this grew to 5 GB of RAM before we had to restart the application.
For long-running activities, Microsoft advises using short-lived sub-activities instead and calling subActivity.SetParentId(parentActivity.TraceId, parentActivity.SpanId); to link them to the parent (this avoids multi-threading possibly leading to faulty parents). We had to apply the same fix to some of our libraries for the same reason.
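To illustrate the pattern, here is a minimal sketch using only System.Diagnostics. It is not the interceptor's code and not a proposed patch. The source name and activity names are hypothetical, and it uses the `ActivitySource.StartActivity` overload that takes an explicit parent `ActivityContext` rather than calling `SetParentId` directly; both are meant to link a short-lived child to the long-running parent without keeping the parent `Activity` object alive.

```csharp
using System;
using System.Diagnostics;

// Hypothetical source name, chosen for this sketch only.
var source = new ActivitySource("Example.LongRunningStream");

// Without a listener, StartActivity returns null. In a real application the
// OpenTelemetry SDK registers one; here a sample-everything listener is added.
using var listener = new ActivityListener
{
    ShouldListenTo = s => s.Name == source.Name,
    Sample = (ref ActivityCreationOptions<ActivityContext> _) =>
        ActivitySamplingResult.AllData,
};
ActivitySource.AddActivityListener(listener);

// Record the start of the stream, then keep only its context (a small struct
// holding trace ID, span ID and flags) instead of the Activity itself.
ActivityContext parentContext;
using (var streamActivity = source.StartActivity("stream-opened", ActivityKind.Server))
{
    parentContext = streamActivity!.Context;
}

// Each message gets its own short-lived activity linked to the stream's
// context. Tags and events land on the child, which is disposed right away.
for (var i = 0; i < 3; i++)
{
    using var messageActivity = source.StartActivity(
        "stream-message", ActivityKind.Internal, parentContext);
    messageActivity?.SetTag("message.index", i);

    var sameTrace = messageActivity?.TraceId == parentContext.TraceId;
    var sameParent = messageActivity?.ParentSpanId == parentContext.SpanId;
    Console.WriteLine($"message {i}: sameTrace={sameTrace}, sameParent={sameParent}");
}
```

Because per-message data is attached to the children, nothing accumulates on a single activity that lives as long as the stream. Whether this maps cleanly onto ServerRpcScope is an open question for the maintainers.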
Steps to Reproduce
- Create a long-running connection
- Spam thousands of gRPC calls
- Check the RAM usage
Expected Result
No memory leak, and activities are not held indefinitely.
Actual Result
Memory leak due to RpcScope keeping a single Activity for the whole lifetime of the call without creating child activities.
Additional Context
Sadly, we are bound to OpenTelemetry 1.9.0, since 1.10.0 raised the minimum versions of Microsoft.Extensions.Diagnostics.Abstractions and Microsoft.Extensions.Logging.Configuration to 9.x. According to semantic versioning, this is a breaking change and should have been reflected by releasing it as 2.0.0 instead. Since we follow semver2 and can't break our upstream users, we have to stick with 8.x on net8.0 and only use the newer versions on net9.0.