stats.Handler's HandleRPC is called with invalid context #6928

Closed as not planned
@mx-psi

Description

What version of gRPC are you using?

This happens with v1.60.1 but not with v1.59.0

What version of Go are you using (go version)?

v1.21.0

What operating system (Linux, Windows, …) and version?

Happens at least on Amazon Linux, but I don't think this is specific to a given operating system.

What did you do?

The opentelemetry-go-contrib project has some instrumentation for grpc-go. This project defines a stats.Handler. We use this in the OpenTelemetry Collector and have seen reports of crashes on its latest version, see open-telemetry/opentelemetry-collector/issues/9296.

We unfortunately don't have a minimal example to reproduce this at this time.

What did you expect to see?

No crashes :) I would expect the context value set on the TagRPC call to always be recoverable on the HandleRPC call.
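To make the expectation concrete, here is a minimal sketch of the contract I have in mind, using only the standard library. The names `rpcInfoKey`, `rpcInfo`, `tagRPC`, `infoFromContext` and `recoverable` are hypothetical and not taken from grpc-go or otelgrpc. `tagRPC` only stands in for the role of `TagRPC`, and the lookup stands in for what `HandleRPC` does.

```go
package main

import (
	"context"
	"fmt"
)

// rpcInfoKey is a hypothetical unexported context key type.
type rpcInfoKey struct{}

// rpcInfo is hypothetical per-RPC state attached when the RPC is tagged.
type rpcInfo struct {
	fullMethod string
}

// tagRPC stands in for the role of stats.Handler.TagRPC: it returns a
// derived context that carries the per-RPC state.
func tagRPC(ctx context.Context, fullMethod string) context.Context {
	return context.WithValue(ctx, rpcInfoKey{}, &rpcInfo{fullMethod: fullMethod})
}

// infoFromContext stands in for the lookup done in HandleRPC. It reports
// whether the state was found instead of assuming that it is present.
func infoFromContext(ctx context.Context) (*rpcInfo, bool) {
	ri, ok := ctx.Value(rpcInfoKey{}).(*rpcInfo)
	return ri, ok && ri != nil
}

// recoverable reports whether the state can be recovered, depending on
// whether the context passed to the lookup is the one returned by tagRPC.
func recoverable(tagged bool) bool {
	ctx := context.Background()
	if tagged {
		ctx = tagRPC(ctx, "/pkg.Service/Method")
	}
	_, ok := infoFromContext(ctx)
	return ok
}

func main() {
	fmt.Println("tagged context:  ", recoverable(true))
	fmt.Println("untagged context:", recoverable(false))
}
```

The second case is what the crash looks like from the handler's side: the lookup runs against a context that does not descend from the one returned when the RPC was tagged.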

What did you see instead?

The context does not have this value. See open-telemetry/opentelemetry-collector/issues/9296 for details. The crash trace is:

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x1235929]

goroutine 167 [running]:
go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc.(*config).handleRPC(0xc002a0e9c0, {0x902ac80, 0xc002aaa510}, {0x8ff2fd0?, 0xc0020fba88?})
	go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc@v0.46.1/stats_handler.go:144 +0xa9
go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc.(*serverHandler).HandleRPC(0xc00232a340?, {0x902ac80?, 0xc002aaa510?}, {0x8ff2fd0?, 0xc0020fba88?})
	go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc@v0.46.1/stats_handler.go:88 +0x2a
google.golang.org/grpc/internal/transport.(*http2Server).WriteStatus(0xc00232a340, 0xc002a0c480, 0xc002b3c340)
	google.golang.org/grpc@v1.60.1/internal/transport/http2_server.go:1071 +0xaf2
google.golang.org/grpc.(*Server).handleStream(0xc002858c00, {0x905cfe0, 0xc00232a340}, 0xc002a0c480)
	google.golang.org/grpc@v1.60.1/server.go:1749 +0x575
google.golang.org/grpc.(*Server).serveStreams.func2.1()
	google.golang.org/grpc@v1.60.1/server.go:1016 +0x59
created by google.golang.org/grpc.(*Server).serveStreams.func2 in goroutine 166
	google.golang.org/grpc@v1.60.1/server.go:1027 +0x115

Additional details

I think this is a bug because of the comment here:

if ri == nil {
	// Shouldn't happen because TagRPC populates this information.
	return
}

The opentelemetry-go-contrib maintainers also think this is a bug. I filed open-telemetry/opentelemetry-go-contrib/pull/4825 to guard the code against this, but this still seems worth looking into in grpc-go.
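For illustration, a defensive guard of this kind could look roughly like the sketch below. This is not the code from the pull request. `callStateKey`, `callState`, `handleRPC` and `handledCount` are hypothetical names, and the handler is a plain function rather than a real `stats.Handler` method so that the example stays standard-library only.

```go
package main

import (
	"context"
	"fmt"
)

// callStateKey is a hypothetical unexported context key type.
type callStateKey struct{}

// callState is hypothetical per-RPC state that the handler mutates.
type callState struct {
	events int
}

// handleRPC stands in for the role of HandleRPC. It returns early when the
// state is missing instead of dereferencing a nil pointer, and reports
// whether the event was recorded.
func handleRPC(ctx context.Context) bool {
	cs, _ := ctx.Value(callStateKey{}).(*callState)
	if cs == nil {
		// State was never attached to this context: skip the event.
		return false
	}
	cs.events++
	return true
}

// handledCount delivers n events and returns how many were recorded.
func handledCount(tagged bool, n int) int {
	ctx := context.Background()
	if tagged {
		ctx = context.WithValue(ctx, callStateKey{}, &callState{})
	}
	recorded := 0
	for i := 0; i < n; i++ {
		if handleRPC(ctx) {
			recorded++
		}
	}
	return recorded
}

func main() {
	fmt.Println("recorded with state:   ", handledCount(true, 3))
	fmt.Println("recorded without state:", handledCount(false, 3))
}
```

A guard like this avoids the panic but silently drops the events for the affected RPC, which is why the underlying cause still seems worth tracking down.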

I did a first pass to try to narrow down which change caused this. My guess is that it was #6716 or, less likely, #6750. Maybe @zasweq can help here?
