Tracing is now very broken in Azure Functions #2733

Open
@martinjt

Description

I'm not entirely sure what has changed, though I suspect it was a host change. Here is what I'm seeing, based on multiple customer issues we've had.

There is now an empty Activity present in the execution path. Whenever a user adds tracing, this new Activity becomes the parent; however, you can't listen to it, because it does not have a source name.

This leads to the new ActivitySource named Microsoft.Azure.Functions.Worker. When you listen to that source, it does take over the blank Activity; however, it uses the TraceContext from the FunctionContext as its parent, rather than the inbound traceparent from the HttpRequest headers. This means that your span has a parent which you're unable to access.
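
To illustrate the propagation being described, here is a minimal sketch of parenting a span to an inbound traceparent value by hand, using only `System.Diagnostics`. The source name `Sketch.ManualPropagation`, the span name `http-trigger`, and the helper `StartFromTraceparent` are hypothetical, and the header value is the example from the W3C Trace Context specification. This is not a confirmed fix for the behaviour above, and it does not change the parent of the span emitted by Microsoft.Azure.Functions.Worker.

```csharp
using System;
using System.Diagnostics;

// Hypothetical source name, used only for this sketch.
var source = new ActivitySource("Sketch.ManualPropagation");

// Without a listener, StartActivity returns null, so record everything from this source.
using var listener = new ActivityListener
{
    ShouldListenTo = s => s.Name == source.Name,
    Sample = (ref ActivityCreationOptions<ActivityContext> options) =>
        ActivitySamplingResult.AllDataAndRecorded,
};
ActivitySource.AddActivityListener(listener);

// Starts a server span parented to the inbound W3C traceparent value, if it parses.
Activity? StartFromTraceparent(string? traceparent, string? tracestate)
{
    // An explicitly supplied parent context is used instead of Activity.Current.
    if (ActivityContext.TryParse(traceparent, tracestate, out var parent))
    {
        return source.StartActivity("http-trigger", ActivityKind.Server, parent);
    }

    // No usable header: fall back to whatever the ambient parent is.
    return source.StartActivity("http-trigger", ActivityKind.Server);
}

// Example header value taken from the W3C Trace Context specification.
const string inbound = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01";

using var activity = StartFromTraceparent(inbound, null);
Console.WriteLine($"TraceId={activity?.TraceId} ParentSpanId={activity?.ParentSpanId}");
```

In a function, the header value would have to be read from the incoming request; how that is wired up depends on the hosting model and is left out here.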

[image: screenshot of the resulting trace]
That is what a trace looks like right now with the OOTB setup.

        services.AddOpenTelemetry()
            .ConfigureResource(resource => resource.AddService("practical-otel-azure-functions"))
            .UseFunctionsWorkerDefaults()
            .WithTracing(tracingBuilder =>
            {
                tracingBuilder
                    .SetSampler<AlwaysOnSampler>()
                    .AddSource("Microsoft.Azure.Functions.Worker")
                    .AddHttpClientInstrumentation()
                    .AddAspNetCoreInstrumentation()
                    .AddOtlpExporter();
            });

Aspire renders it in a semi-OK way, but that is compensating for the fact that the data on those spans isn't right (missing parents and missing root spans).

I think the big change is that, previously, there was no TraceContext on the function context, and therefore there was no active Activity. Since the upgrade, this has changed: there is now an Activity with a remote parent that there is no way to access (there's supposed to be a way to access it in Flex Consumption).

To be clear, this is a breaking change. Adding an Activity that is not trackable and overriding the inbound traceparent makes functions untraceable, because context propagation no longer works.

The AspNetCore span completely baffles me: it has a different parent from the FunctionsWorker span, and it is also somehow outside of the hierarchy.

I'd love to know what caused this, and whether there is a plan to fix functions tracing for users who aren't upgrading to Flex Consumption.

Steps to reproduce

1. Create a new HTTP Trigger function.
2. Add OpenTelemetry.
3. Send the data to somewhere you can see the raw data.
4. Run it locally.
5. Hit the endpoint.

Look at the telemetry and you'll see that it doesn't follow a linear hierarchy, and that there is no root span (a span without a parent).
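
When reproducing, it can help to inspect the ambient Activity chain directly rather than relying on how a backend renders it. The sketch below walks from an Activity up through its in-process parents and prints each one's source name and parent span ID. The helper `DescribeChain` is hypothetical; in a function it would be called with `Activity.Current`. The demo Activity only shows what an Activity with an empty source name looks like and is not taken from the Functions host.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Describes an Activity and its in-process ancestors, innermost first.
// A remote parent is not part of this chain; it only shows up as parentSpan.
List<string> DescribeChain(Activity? start)
{
    var lines = new List<string>();
    for (var a = start; a is not null; a = a.Parent)
    {
        var sourceName = a.Source.Name.Length == 0 ? "<empty>" : a.Source.Name;
        lines.Add($"{a.OperationName} source={sourceName} span={a.SpanId} parentSpan={a.ParentSpanId}");
    }
    return lines;
}

// Demo: an Activity created directly, not through a named ActivitySource,
// reports an empty source name.
using var demo = new Activity("demo").Start();
foreach (var line in DescribeChain(Activity.Current))
{
    Console.WriteLine(line);
}
```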

Metadata

Labels

Needs: Attention 👋
potential-bug (Items opened using the bug report template, not yet triaged and confirmed as a bug)
