-
Notifications
You must be signed in to change notification settings - Fork 32
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add otel tracing to authorino #380
Conversation
Signed-off-by: ROHITH RAJU <rohithraju488@gmail.com>
Signed-off-by: ROHITH RAJU <rohithraju488@gmail.com>
What should my next steps be ? |
This is a very good start, @Rohith-Raju. Thank you again for putting this together! I've left a few comments. After that, I imagine your next move would be instrumenting the Line 230 in 02c2929
|
HI @guicassolato, I don't see them here. |
My bad @Rohith-Raju. The review was pending in GH. |
Signed-off-by: ROHITH RAJU <rohithraju488@gmail.com>
Signed-off-by: ROHITH RAJU <rohithraju488@gmail.com>
Signed-off-by: ROHITH RAJU <rohithraju488@gmail.com>
PR looking great, @Rohith-Raju! Do you mind writing down a few verification steps in the description, so people can try this out manually and see how it looks like? It usually starts with Thanks! |
Verification steps I used to try this out locally: make cluster install build
docker run --rm --name jaeger -d \
-e COLLECTOR_ZIPKIN_HTTP_PORT=9411 \
-p 5775:5775/udp \
-p 6831:6831/udp \
-p 6832:6832/udp \
-p 5778:5778 \
-p 16686:16686 \
-p 14268:14268 \
-p 9411:9411 \
jaegertracing/all-in-one:1.6
bin/authorino server --observability-service-endpoint http://localhost:14268/api/traces Then, in a separate shell: curl http://localhost:5001/check
kubectl apply -f -<<EOF
apiVersion: authorino.kuadrant.io/v1beta1
kind: AuthConfig
metadata:
name: my-authconfig
spec:
hosts:
- localhost
EOF
curl http://localhost:5001/check Navigate to |
Signed-off-by: ROHITH RAJU <rohithraju488@gmail.com>
Signed-off-by: ROHITH RAJU <rohithraju488@gmail.com>
main.go
Outdated
otel_grpc "go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc" | ||
otel_http "go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp" | ||
otel_propagation "go.opentelemetry.io/otel/propagation" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mind separating this block? We try to keep the imports separated in at least 3 blocks:
- Go standard library packages
github.com/kuadrant/authorino/*
packages- Third-party packages
Hi @Rohith-Raju. Thanks again for putting this work together. I finally had time to finish reviewing the PR today. I'll leave a few suggestions here since I cannot push to your branch: diff --git a/main.go b/main.go
index 93df9c8..dfa82b0 100644
--- a/main.go
+++ b/main.go
@@ -37,16 +37,12 @@ import (
"github.com/kuadrant/authorino/pkg/service"
"github.com/kuadrant/authorino/pkg/trace"
"github.com/kuadrant/authorino/pkg/utils"
- otel_grpc "go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc"
- otel_http "go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp"
- otel_propagation "go.opentelemetry.io/otel/propagation"
envoy_auth "github.com/envoyproxy/go-control-plane/envoy/service/auth/v3"
grpc_prometheus "github.com/grpc-ecosystem/go-grpc-prometheus"
"github.com/prometheus/client_golang/prometheus/promhttp"
"github.com/spf13/cobra"
"github.com/spf13/pflag"
- "go.opentelemetry.io/otel"
"google.golang.org/grpc"
"google.golang.org/grpc/credentials"
healthpb "google.golang.org/grpc/health/grpc_health_v1"
@@ -56,6 +52,11 @@ import (
_ "k8s.io/client-go/plugin/pkg/client/auth/gcp"
ctrl "sigs.k8s.io/controller-runtime"
"sigs.k8s.io/controller-runtime/pkg/healthz"
+
+ otel_grpc "go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc"
+ otel_http "go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp"
+ "go.opentelemetry.io/otel"
+ otel_propagation "go.opentelemetry.io/otel/propagation"
// +kubebuilder:scaffold:imports
)
@@ -89,7 +90,7 @@ var (
enableLeaderElection bool
maxHttpRequestBodySize int64
observabilityServiceEndpoint string
- observabilityServiveSeed string
+ observabilityServiceSeed string
scheme = runtime.NewScheme()
@@ -134,6 +135,7 @@ func main() {
cmdServer.PersistentFlags().BoolVar(&enableLeaderElection, "enable-leader-election", false, "Enable leader election for status updater - ensures only one instance of Authorino tries to update the status of reconciled resources")
cmdServer.PersistentFlags().Int64Var(&maxHttpRequestBodySize, "max-http-request-body-size", utils.EnvVar("MAX_HTTP_REQUEST_BODY_SIZE", int64(8192)), "Maximum size of the body of requests accepted in the raw HTTP interface of the authorization server - in bytes")
cmdServer.PersistentFlags().StringVar(&observabilityServiceEndpoint, "observability-service-endpoint", "", "Endpoint URL of the OpenTelemetry collector service")
+ cmdServer.PersistentFlags().StringVar(&observabilityServiceSeed, "observability-service-seed", "", "Seed attribute of the OpenTelemetry resource")
cmdVersion := &cobra.Command{
Use: "version",
@@ -180,7 +182,7 @@ func run(cmd *cobra.Command, _ []string) {
if observabilityServiceEndpoint != "" {
otel.SetLogger(logger)
- tp, err := trace.CreateTraceProvider(observabilityServiceEndpoint, version)
+ tp, err := trace.CreateTraceProvider(observabilityServiceEndpoint, version, observabilityServiceSeed)
if err != nil {
logger.Error(err, "unable to create traceprovider")
os.Exit(1)
diff --git a/pkg/context/context.go b/pkg/context/context.go
index 3438844..062fb53 100644
--- a/pkg/context/context.go
+++ b/pkg/context/context.go
@@ -21,18 +21,18 @@ type options struct {
timeout time.Duration
}
-type option func(options)
+type option func(*options)
// WithParent returns an option to create a new context based on an existing parent context.
func WithParent(parent gocontext.Context) option {
- return func(opts options) {
+ return func(opts *options) {
opts.parent = parent
}
}
// WithTimeout return an option to create a new context that cancels itself automatically after the timeout.
func WithTimeout(timeout time.Duration) option {
- return func(opts options) {
+ return func(opts *options) {
opts.timeout = timeout
}
}
@@ -41,15 +41,16 @@ func WithTimeout(timeout time.Duration) option {
// If a parent context is provided, creates a copy of the parent with further options.
// If a timeout option is provided, creates a context that cancels itself automatically after the timeout.
func New(opts ...option) gocontext.Context {
- o := options{}
+ o := &options{}
for _, opt := range opts {
opt(o)
}
- ctx := gocontext.Background()
-
+ var ctx gocontext.Context
if o.parent != nil {
ctx = o.parent
+ } else {
+ ctx = gocontext.Background()
}
if o.timeout > 0 {
diff --git a/pkg/service/auth.go b/pkg/service/auth.go
index 01b7f4e..229191a 100644
--- a/pkg/service/auth.go
+++ b/pkg/service/auth.go
@@ -11,11 +11,6 @@ import (
"strings"
"time"
- "go.opentelemetry.io/otel"
- otel_attr "go.opentelemetry.io/otel/attribute"
- otel_codes "go.opentelemetry.io/otel/codes"
- otel_propagation "go.opentelemetry.io/otel/propagation"
- otel_trace "go.opentelemetry.io/otel/trace"
gocontext "golang.org/x/net/context"
"github.com/kuadrant/authorino/pkg/auth"
@@ -32,6 +27,11 @@ import (
"google.golang.org/protobuf/types/known/structpb"
v1 "k8s.io/api/admission/v1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
+
+ "go.opentelemetry.io/otel"
+ otel_attr "go.opentelemetry.io/otel/attribute"
+ otel_codes "go.opentelemetry.io/otel/codes"
+ otel_trace "go.opentelemetry.io/otel/trace"
)
const (
@@ -88,14 +88,10 @@ func NewAuthService(index index.Index, timeout time.Duration, maxHttpRequestBody
// The body can be any JSON object; in case the input is a Kubernetes AdmissionReview resource,
// the response is compatible with the Dynamic Admission API
func (a *AuthService) ServeHTTP(resp http.ResponseWriter, req *http.Request) {
- ctx := context.New(context.WithTimeout(a.Timeout))
- ctx = otel.GetTextMapPropagator().Extract(ctx, otel_propagation.HeaderCarrier(req.Header))
requestId := fmt.Sprintf("%x", sha256.Sum256([]byte(fmt.Sprint(req))))
- attr := otel_trace.WithAttributes(otel_attr.String("authorino.request_id", requestId))
- ctx, span := otel.Tracer("AuthService").Start(ctx, "ServeHTTP", attr)
-
- otel.GetTextMapPropagator().Inject(ctx, otel_propagation.HeaderCarrier(resp.Header()))
+ ctx := context.New(context.WithParent(req.Context()), context.WithTimeout(a.Timeout))
+ ctx, span := otel.Tracer("AuthService").Start(ctx, "ServeHTTP", otel_trace.WithAttributes(otel_attr.String("authorino.request_id", requestId)))
defer span.End()
logger := log.
@@ -241,18 +237,17 @@ func (a *AuthService) ServeHTTP(resp http.ResponseWriter, req *http.Request) {
// Check performs authorization check based on the attributes associated with the incoming request,
// and returns status `OK` or not `OK`.
func (a *AuthService) Check(parentContext gocontext.Context, req *envoy_auth.CheckRequest) (*envoy_auth.CheckResponse, error) {
- requestId := req.Attributes.Request.Http.GetId()
- attr := otel_trace.WithAttributes(otel_attr.String("authorino.request_id", requestId))
- _, span := otel.Tracer("AuthService").Start(parentContext, "Check", attr)
+ requestData := req.Attributes.Request.Http
+ requestId := requestData.GetId()
+
+ ctx, span := otel.Tracer("AuthService").Start(parentContext, "Check", otel_trace.WithAttributes(otel_attr.String("authorino.request_id", requestId)))
defer span.End()
requestLogger := log.WithName("service").WithName("auth").WithValues("request id", requestId)
- ctx := log.IntoContext(context.New(context.WithParent(parentContext), context.WithTimeout(a.Timeout)), requestLogger)
+ ctx = log.IntoContext(context.New(context.WithParent(ctx), context.WithTimeout(a.Timeout)), requestLogger)
a.logAuthRequest(req, ctx)
- requestData := req.Attributes.Request.Http
-
// service config
var host string
if h, overridden := req.Attributes.ContextExtensions[X_LOOKUP_KEY_NAME]; overridden {
diff --git a/pkg/trace/exporter.go b/pkg/trace/exporter.go
index 3a41ac6..13a18e0 100644
--- a/pkg/trace/exporter.go
+++ b/pkg/trace/exporter.go
@@ -1,6 +1,7 @@
package trace
import (
+ "go.opentelemetry.io/otel/attribute"
"go.opentelemetry.io/otel/exporters/jaeger"
"go.opentelemetry.io/otel/sdk/resource"
"go.opentelemetry.io/otel/sdk/trace"
@@ -14,26 +15,27 @@ func newExporter(url string) (*jaeger.Exporter, error) {
return jaeger.New(collector)
}
-func newResource(version string) *resource.Resource {
- r, _ := resource.Merge(
- resource.Default(),
- resource.NewWithAttributes(
- semconv.SchemaURL,
- semconv.ServiceNameKey.String("authorino"),
- semconv.ServiceVersionKey.String(version),
- ),
- )
+func newResource(version, seed string) *resource.Resource {
+ attrs := []attribute.KeyValue{
+ semconv.ServiceNameKey.String("authorino"),
+ semconv.ServiceVersionKey.String(version),
+ }
+ if seed != "" {
+ attrs = append(attrs, attribute.String("seed", seed))
+ }
+ res := resource.NewWithAttributes(semconv.SchemaURL, attrs...)
+ r, _ := resource.Merge(resource.Default(), res)
return r
}
-func CreateTraceProvider(url string, version string) (*trace.TracerProvider, error) {
+func CreateTraceProvider(url, version, seed string) (*trace.TracerProvider, error) {
exp, err := newExporter(url)
if err != nil {
return nil, err
}
tp := trace.NewTracerProvider(
trace.WithBatcher(exp),
- trace.WithResource(newResource(version)),
+ trace.WithResource(newResource(version, seed)),
)
return tp, nil
} The most important changes are:
To test these changesBuild and start the server: make cluster install build && bin/authorino server --observability-service-endpoint=http://localhost:14268/api/traces --observability-service-seed=$(date +%s) --log-level=debug The value of the Send requests to the Raw HTTP Authorization interface: curl -H "Traceparent: 00-$(openssl rand -hex 16)-7359b90f4355cfd9-01" http://localhost:5001/check -v
curl -H "Traceparent: 00-$(openssl rand -hex 16)-7359b90f4355cfd9-01" http://localhost:5001/check -X PUT -v
curl http://localhost:5001/check -v Requests should show up in the Jaeger UI properly nested: |
In case one wants to try the changes by hitting the default GRPC interface instead of the Raw HTTP Authorization one:
for repo in \
envoyproxy/data-plane-api \
googleapis/googleapis \
envoyproxy/protoc-gen-validate \
cncf/udpa
do
git clone git@github.com:$repo /tmp/authorino-grpc-client/$repo
done
brew install grpcurl
grpcurl -import-path /tmp/authorino-grpc-client/envoyproxy/data-plane-api \
-import-path /tmp/authorino-grpc-client/googleapis/googleapis \
-import-path /tmp/authorino-grpc-client/envoyproxy/protoc-gen-validate \
-import-path /tmp/authorino-grpc-client/cncf/udpa \
-proto /tmp/authorino-grpc-client/envoyproxy/data-plane-api/envoy/service/auth/v3/external_auth.proto \
-plaintext -d "{\"attributes\":{\"request\":{\"http\":{\"host\":\"localhost\",\"headers\":{\"traceparent\":\"00-$(openssl rand -hex 16)-7359b90f4355cfd9-01\"}}}}}" \
localhost:50051 \
envoy.service.auth.v3.Authorization.Check
grpcurl -import-path /tmp/authorino-grpc-client/envoyproxy/data-plane-api \
-import-path /tmp/authorino-grpc-client/googleapis/googleapis \
-import-path /tmp/authorino-grpc-client/envoyproxy/protoc-gen-validate \
-import-path /tmp/authorino-grpc-client/cncf/udpa \
-proto /tmp/authorino-grpc-client/envoyproxy/data-plane-api/envoy/service/auth/v3/external_auth.proto \
-plaintext -d '{"attributes":{"request":{"http":{"host":"localhost"}}}}' \
localhost:50051 \
envoy.service.auth.v3.Authorization.Check Authorino must be running locally and Jaeger in a docker as instructions shared before. The traces in the Jaeger UI in this case should look like this: |
Thanks for the feedback @guicassolato, I'll make the changes and get back 😄 |
Signed-off-by: Rohith-Raju <rohithraju488@gmail.com>
Signed-off-by: Rohith-Raju <rohithraju488@gmail.com>
Signed-off-by: Rohith-Raju <rohithraju488@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some automated unit (and perhaps also integration) tests needed, plus docs, but I thing those can come in a subsequent iteration, given the amount of manual testing already performed here.
I've left a few comments for other future improvements as well.
semconv.ServiceVersionKey.String(version), | ||
} | ||
if seed != "" { | ||
attrs = append(attrs, attribute.String("seed", seed)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking again about this. Perhaps it could expect a comma-separated key=value list instead. As long as it's in the same release, it can be done in a separate iteration.
os.Exit(1) | ||
} | ||
otel.SetTracerProvider(tp) | ||
otel.SetTextMapPropagator(otel_propagation.NewCompositeTextMapPropagator(otel_propagation.TraceContext{}, otel_propagation.Baggage{})) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Worth enabling propagation even when observability reporting is off? Right now, it's not injecting trace IDs or baggage into outgoing requests (e.g. HTTP requests to external sources of metadata), but it'll be useful once added.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was going to ask you the same, thanks for covering this 👍
cmdServer.PersistentFlags().StringVar(&observabilityServiceEndpoint, "observability-service-endpoint", "", "Endpoint URL of the OpenTelemetry collector service") | ||
cmdServer.PersistentFlags().StringVar(&observabilityServiceSeed, "observability-service-seed", "", "Seed attribute of the OpenTelemetry resource") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need support by the Authorino Operator.
cmdServer.PersistentFlags().StringVar(&observabilityServiceEndpoint, "observability-service-endpoint", "", "Endpoint URL of the OpenTelemetry collector service") | ||
cmdServer.PersistentFlags().StringVar(&observabilityServiceSeed, "observability-service-seed", "", "Seed attribute of the OpenTelemetry resource") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need documentation at https://github.com/Kuadrant/authorino/blob/main/docs/user-guides/observability.md#tracing.
return r | ||
} | ||
|
||
func CreateTraceProvider(url, version, seed string) (*trace.TracerProvider, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing unit tests.
Should tracing be opt-in/opt-out feature? Some "switch" to disable tracing might be interesting? |
Question: this is only about tracing or it also exposes new metrics in the metric endpoint ? |
For now, it's only about tracing, @eguzki. Authorino does expose endpoints for collectors to pull metrics (Prometheus format). In the future, we can extend the OpenTelemetry instrumentation added here to push the metrics to the observability service endpoint as well. |
Amazing job @Rohith-Raju. Thank you! |
Hello, This pr is #163
Tasks Done
Verification steps courtesy @guicassolato :
Then, in a separate shell:
Navigate to http://localhost:16686 to check the traces in the Jaeger UI.
Signed-off-by: ROHITH RAJU rohithraju488@gmail.com