-
Notifications
You must be signed in to change notification settings - Fork 178
Initial attempt at OpenTelemetry support #873
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Initial attempt at OpenTelemetry support #873
Conversation
| if remote.Transport == nil { | ||
| remote.Transport = http.DefaultTransport | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like NewTransport does this internally, so we can skip it.
| } | ||
|
|
||
| if c.Otel.Tracing.SampleRate == 0 { | ||
| c.Otel.Tracing.SampleRate = 1.0 // Default to 100% sampling |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this default risky?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think typically a user from my experience would expect to get all the traces that're being emitted, now if they have too much data, they can either filter the trace spans out in the otel-collector or override this config.
| c.Otel.Tracing.SampleRate, | ||
| ) | ||
| if err != nil { | ||
| log.Printf("WARNING: Failed to initialize OpenTelemetry: %v", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick: please use Warning instead of WARNING
|
|
||
| // Initialize OpenTelemetry if enabled | ||
| var otelProvider *otel.TracerProvider | ||
| if c.Otel != nil && c.Otel.Tracing != nil && c.Otel.Tracing.Enabled { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be worth setting this to an otelTracingEnabled variable and check that instead, since we do it multiple times.
| // Wrap status handler with OTEL tracing if enabled | ||
| if c.Otel != nil && c.Otel.Tracing != nil && c.Otel.Tracing.Enabled { | ||
| statusHandler = otelhttp.NewHandler( | ||
| http.HandlerFunc(statusHandler), | ||
| "status", | ||
| ).ServeHTTP | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure how useful it would be to trace the status page?
| // TLS is usually not typically required if running in a k8s environment | ||
| otlptracegrpc.WithInsecure(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be worth adding a configuration setting to avoid this.
| Name: "otel.tracing.exporter_endpoint", | ||
| Usage: "The OTLP exporter endpoint for OpenTelemetry traces. Can also be set via OTEL_EXPORTER_OTLP_ENDPOINT env var.", | ||
| DefaultText: "empty, will use OTEL_EXPORTER_OTLP_ENDPOINT env var", | ||
| EnvVars: []string{"BAZEL_REMOTE_OTEL_TRACING_EXPORTER_ENDPOINT"}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this code is checking these environment variables with the BAZEL_REMOTE_ prefix. I'm not sure if it's better to use standard variables, or to use custom ones.
| // Add OTEL interceptors first for complete trace coverage | ||
| if c.Otel != nil && c.Otel.Tracing != nil && c.Otel.Tracing.Enabled { | ||
| streamInterceptors = append(streamInterceptors, | ||
| otelgrpc.StreamServerInterceptor()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These deprecated functions should be avoided.
| // Wrap transport with OTEL instrumentation if enabled | ||
| var transport http.RoundTripper = tr | ||
| if otelEnabled { | ||
| transport = otelhttp.NewTransport(tr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not tr = otelhttp.NewTransport(tr) ?
Addresses #684
This PR attempts to add initial support for Opentelemetry using https://github.com/open-telemetry/opentelemetry-go:
We don't have spans for filesystem calls - although we can add them. But I don't think it's needed for that disk/IO metrics can usually help in that situation.
Testing: Some unit tests, I'll share some details once I am able to verify end-to-end after building it here and testing with otel-collector. Below is an example of a trace I generated by running it locally.
for bystream fetch

for remote asset api
