-
Notifications
You must be signed in to change notification settings - Fork 520
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(tracing): Add tracestate
header handling
#1179
Conversation
5f3f8be
to
3877df9
Compare
No behavior changes - just moving stuff around. # Conflicts: # sentry_sdk/tracing.py
…nvelope headers (#971)
Mostly docstrings and comments. Also a stray, no-longer-necessary variable initialization and a few log statements. Pulled from an upcoming PR in order to make it less noisy.
Three small changes here, mostly unrelated. If you're specifically interested in one of them, it's probably easiest to look at them commit by commit in the PR. - Correct return type of `compute_tracestate_entry` to correctly return `None` when we're missing a client or DSN. - Let transactions (effectively) point to themselves as the "containing transaction." In order to avoid a circular reference, this is handled through a new property on both spans and transactions. As a result of this change, we can eliminate a good deal of type-checking that we've been doing up until now. - Switch `tracestate` creation to be lazy. Rather than always creating a value in the transaction constructor, we now wait until we need it - either for an outgoing HTTP header or for the envelope header when sending the transaction event.
) This adds user data (specifically `id` and `segment`) and transaction name to the `tracetate` value. Doing this has two known limitations, which, though we've discussed them, I'm adding here for posterity: 1) Adding this data puts us over the character limit for tracestate values [listed in the W3C spec](https://www.w3.org/TR/trace-context/#value) (256 characters). For reference: Tracestate data: ``` { "trace_id": "12312012123120121231201212312012", "environment": "dogpark", "release": "off.leash.park", "public_key": "dogsarebadatkeepingsecrets", "user": {"id": 12312013, "segment": "bigs"}, "transaction": "/interactions/other-dogs/new-dog/" } ``` `tracestate` with without either: 196 characters `tracestate` with user data: 256 characters `tracestate` with transaction name: 264 characters `tracestate` with both: 324 characters 2) This data may change and/or get added to the scope after the tracestate value has been calculated, making it impossible to sample based on those attributes. This is especially a problem for transaction name, which in some frameworks isn't set to its final value until the transaction ends. This poses the added problem that the transaction name in its raw, un-finalized form may contain PII, because it is often the raw URL as opposed to the parameterized one (so, `/users/maisey/tricks/` rather than `/users/:username/tricks/`). More work needs to be done to investigate whether the final transaction name can be set earlier in any/all of the frameworks where this poses a problem. (For instance, it is a known problem in our Express integration, but not yet clear if it is a problem in any Python frameworks.)
This adds a feature flag, `"propagate_tracestate"`, to `_experiments`, which enables the tracestate header handling behavior. In particular, if the flag is `False` or missing: - `tracestate` headers won't be attached to outgoing HTTP requests, and - `trace` data will not be added to envelope headers.
… garbage collection (#1184) There are a few places in the SDK where we have circular references: 1) Transaction -> span recorder -> spans including transaction itself 2) Child span -> span recorder -> spans including child span itself 3) Transaction -> span recorder -> spans -> containing transaction 4) In `serializer.py`, `_serialize_node()` -> `_serialize_node_impl()` -> `_serialize_node()`, making each a closure for the other. This PR addresses points 1-3*, by making the following changes: 1) Transactions are no longer added to their own span recorders. (The SDK doesn't ever use the fact that they're there, and they're stripped out before the event is sent to Sentry.) 2) Child spans no longer have their own span recorder pointer, and instead access the recorder through their containing transaction. 3) When a transaction ends, after it harvests any completed spans, it now jettisons its link to its span recorder before it (the transaction) goes out of scope. It also adds to/modifies tests covering all three scenarios *Point 4, which we only discovered in the process of fixing 1-3, concerns a different system than the rest, and therefore will need a separate fix.
48cadf1
to
27240ee
Compare
tracestate
header handlingtracestate
header handling
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
feedback: this PR is very hard to review. it appears to consist of already-reviewed PRs exclusively which have been reviewed by a different set of people sometimes. and now I can re-review 1.3k added lines at once
If we're going to release all of those changes at once I think it could be rather risky. It seems to me that we have not ever deployed any part of this into our own SaaS usage. I would suggest to split this PR back up into many PRs, merge those, run a (stable) release for each, deploy each release separately to SaaS. Otherwise I think the risk is too great.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rubberstamping. apparently this featurebranch has been deployed to saas already at one point, and there's not many ways to reduce risk afaict, unfortunately
We talked about this offline, but for posterity: This was a feature branch, so yes, it's a bunch of already-reviewed PRs. But we did actually already deploy it, in our back and front ends (see getsentry/sentry#24177 and https://github.com/getsentry/getsentry/pull/5192), at a point when 80-90 percent of the work had already been done.) We can talk process for future work, but in the meantime, I'm going to merge this as agreed. |
This PR introduces handling of the
tracestate
header, as described in the W3C Trace Context spec and our own corresponding spec.Key features:
from_traceparent
in favor ofcontinue_from_headers
, which now propagates both incomingsentry-trace
and incomingtracestate
headers.tracestate
value as a header on outgoing HTTP requests when they're made during a transaction.tracestate
data to transaction envelope headers.Supporting changes:
transaction
/span
circular references before garbage collection #1184.tracing_utils
file.Note:
tracestate
handling is currently feature-gated by the flagpropagate_tracestate
in the_experiments
SDK option.More details can be found in the main PR on this branch, #971.