Description
What version of gRPC are you using?
This happens with v1.60.1 but not with v1.59.0
What version of Go are you using (go version)?
v1.21.0
What operating system (Linux, Windows, …) and version?
Happens at least on Amazon Linux, but I don't think this is specific to a given operating system.
What did you do?
The opentelemetry-go-contrib project has some instrumentation for grpc-go. This project defines a stats.Handler. We use this in the OpenTelemetry Collector and have seen reports of crashes on its latest version; see open-telemetry/opentelemetry-collector/issues/9296.
We unfortunately don't have a minimal example to reproduce this at this time.
What did you expect to see?
No crashes :) I would expect the context value set on the TagRPC call to always be recoverable on the HandleRPC call.
What did you see instead?
The context passed to HandleRPC does not have this value. See details on open-telemetry/opentelemetry-collector/issues/9296; the crash trace is:
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x1235929]
goroutine 167 [running]:
go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc.(*config).handleRPC(0xc002a0e9c0, {0x902ac80, 0xc002aaa510}, {0x8ff2fd0?, 0xc0020fba88?})
go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc@v0.46.1/stats_handler.go:144 +0xa9
go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc.(*serverHandler).HandleRPC(0xc00232a340?, {0x902ac80?, 0xc002aaa510?}, {0x8ff2fd0?, 0xc0020fba88?})
go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc@v0.46.1/stats_handler.go:88 +0x2a
google.golang.org/grpc/internal/transport.(*http2Server).WriteStatus(0xc00232a340, 0xc002a0c480, 0xc002b3c340)
google.golang.org/grpc@v1.60.1/internal/transport/http2_server.go:1071 +0xaf2
google.golang.org/grpc.(*Server).handleStream(0xc002858c00, {0x905cfe0, 0xc00232a340}, 0xc002a0c480)
google.golang.org/grpc@v1.60.1/server.go:1749 +0x575
google.golang.org/grpc.(*Server).serveStreams.func2.1()
google.golang.org/grpc@v1.60.1/server.go:1016 +0x59
created by google.golang.org/grpc.(*Server).serveStreams.func2 in goroutine 166
google.golang.org/grpc@v1.60.1/server.go:1027 +0x115
Additional details
I think this is a bug because of the comment here: grpc-go/stats/opencensus/opencensus.go, lines 204 to 207 in ddd377f.
The opentelemetry-go-contrib maintainers also think this is a bug. I filed open-telemetry/opentelemetry-go-contrib/pull/4825 to guard the handler against the missing value, but this still seems worth looking into in grpc-go.
I did a first pass to narrow down which change caused this; my guess is that it is #6716 or, less likely, #6750. Maybe @zasweq can help here?