Skip to content

fix(node): Fix baggage propagation #11363

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Apr 2, 2024
Merged

fix(node): Fix baggage propagation #11363

merged 3 commits into from
Apr 2, 2024

Conversation

mydea
Copy link
Member

@mydea mydea commented Apr 2, 2024

The node integration tests (correctly!) showed that we were not propagating baggage 100% correct anymore. Turns out there were a few issues with what we did before:

Do not freeze DSC onto span traceState when not continuing incoming trace

The main issue was, that we did not actually set the transaction, sampled and sample_rate baggage entries. This was the case because we only picked these from the trace state, but we only set this fully in our own startSpan() APIs, not when auto instrumentation created spans. So I changed this to not set the trace state in startSpan() APIs, but instead leave it empty unless we are actually continuing an incoming trace. Whenever we are not continuing an incoming trace, we'll now generate the DSC at injection-time based of the actual span data.

So to summarize, we now set the dsc trace state in two places:

  1. In the propagator, when we extract an incoming trace.
  2. In our trace functions, when we use forceTransaction which we treat as continuing an incoming trace.

No merging of existing baggage headers

Another issue that came up is that we did not properly handled other baggage entires being set. This is because of how OTEL does injection, where it passes the carrier as object which we treated as unknown and basically ignored. This meant that if a user did e.g. fetch({ headers: { baggage: 'my=value' }}), we would overwrite the baggage and not keep the exiting value(s). This PR fixes this as well by ensuring we keep this around.

Future improvements

After talking with @lforst, we may be able to improve/streamline this further by getting rid of the DSC trace state.
The logic could be:

  1. If there is a DSC on the scope's propagation context, use that.
  2. Else, use the root span

We may update to this logic internally in a follow up step!

@mydea mydea self-assigned this Apr 2, 2024
@mydea mydea requested review from lforst, Lms24, AbhiPrasad and s1gr1d April 2, 2024 09:57
Copy link

codecov bot commented Apr 2, 2024

Bundle Report

Changes will decrease total bundle size by 3.2MB ⬇️

Bundle name Size Change
@sentry/vercel-edge-cjs 18.23kB 18.23kB ⬆️
@sentry/vercel-edge-esm 16.13kB 0 bytes
@sentry/nextjs-cjs 20.52kB 0 bytes
@sentry/nextjs-esm 20.02kB 0 bytes
@sentry/types-cjs (removed) 35 bytes ⬇️
@sentry/types-esm (removed) 35 bytes ⬇️
@sentry/node-esm (removed) 333.56kB ⬇️
@sentry-internal/replay-esm (removed) 306.46kB ⬇️
@sentry-internal/node-integration-tests-cjs (removed) 1.04kB ⬇️
@sentry/node-cjs (removed) 336.97kB ⬇️
@sentry/opentelemetry-cjs (removed) 68.45kB ⬇️
@sentry/utils-cjs (removed) 178.75kB ⬇️
@sentry/bun-cjs (removed) 13.5kB ⬇️
@sentry-internal/replay-canvas-cjs (removed) 29.51kB ⬇️
@sentry-internal/replay-cjs (removed) 306.35kB ⬇️
@sentry/wasm-cjs (removed) 5.2kB ⬇️
@sentry-internal/feedback-esm (removed) 65.5kB ⬇️
@sentry/bun-esm (removed) 10.05kB ⬇️
@sentry/astro-cjs (removed) 27.13kB ⬇️
@sentry/aws-serverless-cjs (removed) 14.62kB ⬇️
@sentry/react-esm (removed) 41.18kB ⬇️
@sentry/utils-esm (removed) 174.17kB ⬇️
@sentry-internal/tracing-cjs (removed) 108.01kB ⬇️
@sentry/svelte-cjs (removed) 13.84kB ⬇️
@sentry-internal/replay-canvas-esm (removed) 29.43kB ⬇️
@sentry/sveltekit-cjs (removed) 69.31kB ⬇️
@sentry/vue-cjs (removed) 20.19kB ⬇️
@sentry/google-cloud-serverless-cjs (removed) 23.0kB ⬇️
@sentry/sveltekit-esm (removed) 61.08kB ⬇️
@sentry/wasm-esm (removed) 4.85kB ⬇️
@sentry/remix-cjs (removed) 53.62kB ⬇️
@sentry/astro-esm (removed) 23.39kB ⬇️
@sentry/remix-esm (removed) 48.23kB ⬇️
@sentry/opentelemetry-esm (removed) 67.4kB ⬇️
@sentry/gatsby-esm (removed) 385 bytes ⬇️
@sentry/gatsby-cjs (removed) 905 bytes ⬇️
@sentry-internal/node-integration-tests-esm (removed) 888 bytes ⬇️
@sentry/vue-esm (removed) 18.85kB ⬇️
@sentry-internal/tracing-esm (removed) 107.26kB ⬇️
@sentry/profiling-node-cjs (removed) 25.5kB ⬇️
@sentry/profiling-node-esm (removed) 25.52kB ⬇️
@sentry/browser-cjs (removed) 107.36kB ⬇️
@sentry/core-cjs (removed) 240.44kB ⬇️
@sentry/google-cloud-serverless-esm (removed) 19.16kB ⬇️
@sentry/core-esm (removed) 236.82kB ⬇️
@sentry-internal/integration-shims-cjs (removed) 3.65kB ⬇️

// TODO(v8): Fix this test
// eslint-disable-next-line jest/no-disabled-tests
test.skip('Should populate Sentry and ignore 3rd party content if sentry-trace header does not exist', async () => {
test('Should populate Sentry and ignore 3rd party content if sentry-trace header does not exist', async () => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe I'm missunderstanding the test but looking at the assertions below, shouldn't we also check for the presence of foo=bar,bar=baz?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I think this is fine; We should only care about not overwriting baggage entries that were already added to an outgoing request before we inject our entries. The foo and bar entries were not added further; when the outgoing request was made, right?

Comment on lines 370 to 379
/**
* Will parse a baggage header into an array of tuples,
* where the first item is the baggage name, the second the baggage value.
*/
function parseBaggageHeaderString(baggageHeader: string): [string, string][] {
return baggageHeader.split(',').map(baggageEntry => {
const [key, value] = baggageEntry.split('=');
return [decodeURIComponent(key.trim()), decodeURIComponent(value.trim())];
});
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

l: Is there a reason we need to parse this into an array of tuples? just wondering if it makes sense to unify things here with baggageHeaderToObject from @sentry/utils?

Feel free to disregard if N/A :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that makes sense!

@@ -227,6 +225,8 @@ function getContext(scope: Scope | undefined, forceTransaction: boolean | undefi
const { spanId, traceId } = parentSpan.spanContext();
const sampled = getSamplingDecision(parentSpan.spanContext());

// In this case, when we are forcing a transaction, we want to treat this like continuing an incoming trace
Copy link
Member

@Lms24 Lms24 Apr 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

m: just to understand this correctly: Here we're populating a DSC at span creation time - right? Do we still update the DSC's transaction property when the name changes?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are not allowed to update the transaction name in that case because it's a transaction boundary and would break dynamic sampling. Or do you mean, when the parent transaciton's name updates?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in this case yes, we would not update this anymore - BUT it is very likely that by this time we already have updated the span correctly, so it should be fine in 99% of cases (as this will be "frozen" at the time when the child span will be created, by which time the parent span has hopefully already "stabilized") - does this make sense?

Copy link
Contributor

@lforst lforst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed, we should probably not use the trace state but instead use the TracePropagationContext as single source of truth, but otherwise this looks good to me.

Copy link
Contributor

github-actions bot commented Apr 2, 2024

size-limit report 📦

Path Size
@sentry/browser (incl. Tracing, Replay, Feedback) 80.38 KB (-0.11% 🔽)
@sentry/browser (incl. Tracing, Replay) 71.73 KB (-0.09% 🔽)
@sentry/browser (incl. Tracing, Replay with Canvas) 75.55 KB (-0.09% 🔽)
@sentry/browser (incl. Tracing, Replay) - with treeshaking flags 65.3 KB (-0.09% 🔽)
@sentry/browser (incl. Tracing) 36.57 KB (-0.13% 🔽)
@sentry/browser (incl. browserTracingIntegration) 36.57 KB (-0.13% 🔽)
@sentry/browser (incl. feedbackIntegration) 31.53 KB (0%)
@sentry/browser (incl. feedbackModalIntegration) 31.63 KB (0%)
@sentry/browser (incl. feedbackScreenshotIntegration) 31.64 KB (0%)
@sentry/browser (incl. sendFeedback) 27.59 KB (0%)
@sentry/browser 22.75 KB (0%)
CDN Bundle (incl. Tracing, Replay, Feedback) 74.93 KB (+0.02% 🔺)
CDN Bundle (incl. Tracing, Replay) 69.72 KB (-0.02% 🔽)
CDN Bundle (incl. Tracing) 36.26 KB (-0.05% 🔽)
CDN Bundle 24.01 KB (-0.04% 🔽)
CDN Bundle (incl. Tracing, Replay) - uncompressed 218.82 KB (+0.01% 🔺)
CDN Bundle (incl. Tracing) - uncompressed 109.47 KB (+0.01% 🔺)
CDN Bundle - uncompressed 71.09 KB (+0.02% 🔺)
@sentry/react (incl. Tracing, Replay) 71.71 KB (-0.11% 🔽)
@sentry/react 22.78 KB (0%)

Copy link
Member

@Lms24 Lms24 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thx for explaining!

@mydea mydea force-pushed the fn/more-node-tests branch from 4451405 to 5237667 Compare April 2, 2024 11:59
@mydea mydea merged commit 2e0c776 into develop Apr 2, 2024
@mydea mydea deleted the fn/more-node-tests branch April 2, 2024 13:17
mydea added a commit that referenced this pull request Apr 2, 2024
This depends on
#11363 being merged
first...
cadesalaberry pushed a commit to cadesalaberry/sentry-javascript that referenced this pull request Apr 19, 2024
The node integration tests (correctly!) showed that we were not
propagating baggage 100% correct anymore. Turns out there were a few
issues with what we did before:

## Do not freeze DSC onto span `traceState` when not continuing incoming
trace

The main issue was, that we did not actually set the `transaction`,
`sampled` and `sample_rate` baggage entries. This was the case because
we only picked these from the trace state, but we only set this fully in
our own `startSpan()` APIs, not when auto instrumentation created spans.
So I changed this to _not_ set the trace state in `startSpan()` APIs,
but instead leave it empty _unless_ we are actually continuing an
incoming trace. Whenever we are _not_ continuing an incoming trace,
we'll now generate the DSC at injection-time based of the actual span
data.

So to summarize, we now set the `dsc` trace state in two places:

1. In the propagator, when we `extract` an incoming trace.
2. In our trace functions, when we use `forceTransaction` which we treat
as continuing an incoming trace.

## No merging of existing baggage headers

Another issue that came up is that we did not properly handled other
`baggage` entires being set. This is because of how OTEL does injection,
where it passes the `carrier` as object which we treated as `unknown`
and basically ignored. This meant that if a user did e.g. `fetch({
headers: { baggage: 'my=value' }})`, we would overwrite the baggage and
not keep the exiting value(s). This PR fixes this as well by ensuring we
keep this around.

## Future improvements

After talking with @lforst, we may be able to improve/streamline this
further by getting rid of the DSC trace state.
The logic could be:

1. If there is a DSC on the scope's propagation context, use that.
2. Else, use the root span

We may update to this logic internally in a follow up step!
cadesalaberry pushed a commit to cadesalaberry/sentry-javascript that referenced this pull request Apr 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants