Support for tail-based sampling #1720

Open
tdinucci opened this issue May 22, 2021 · 0 comments
Labels
area:sampling Related to trace sampling area:sdk Related to the SDK spec:trace Related to the specification/trace directory

Comments

@tdinucci

What are you trying to achieve?

As far as I can tell, there is no way to determine from a single span whether any further spans may follow it. If so, a collector receiving spans can never be confident that a trace is complete, and therefore cannot know when tail-based sampling is safe to perform.

For short-lived traces it will normally be fine to buffer spans for a window of X seconds and sample the trace at the end of this window. Longer-lived traces, though, which potentially cover a large number of services, may rarely complete within X seconds, in which case they may never be sampled properly.
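
To make the window problem concrete, here is a minimal sketch of a fixed-window buffer. It is not how any particular collector is implemented; `WindowedTraceBuffer`, `add` and `flush_expired` are hypothetical names chosen for illustration, and timestamps are passed in explicitly to keep the example deterministic.

```python
from dataclasses import dataclass, field


@dataclass
class WindowedTraceBuffer:
    """Hypothetical buffer: decides on a trace X seconds after its first span arrives."""
    window_seconds: float
    first_seen: dict = field(default_factory=dict)  # trace_id -> arrival time of first span
    spans: dict = field(default_factory=dict)       # trace_id -> span names received so far

    def add(self, trace_id, span_name, now):
        # The window starts when the first span of a trace is seen.
        self.first_seen.setdefault(trace_id, now)
        self.spans.setdefault(trace_id, []).append(span_name)

    def flush_expired(self, now):
        """Return and forget every trace whose window has elapsed."""
        expired = [t for t, t0 in self.first_seen.items()
                   if now - t0 >= self.window_seconds]
        flushed = {}
        for trace_id in expired:
            flushed[trace_id] = self.spans.pop(trace_id)
            del self.first_seen[trace_id]
        return flushed


buf = WindowedTraceBuffer(window_seconds=10.0)
buf.add("t1", "frontend", now=0.0)
decided = buf.flush_expired(now=10.0)      # sampling decision sees only "frontend"
buf.add("t1", "slow-backend", now=25.0)    # arrives after the decision was made
```

In this sketch the late `slow-backend` span is never seen together with `frontend`, which is the failure mode described above for long-lived traces.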

Additional context.

I wonder if a non-breaking solution would be to introduce an attribute on Spans called something like PropagateCount, which is set whenever a Span is finalised.

If a Span propagated its context twice, then its PropagateCount would equal 2.

Obviously there is no guarantee that the receiver of a propagated context goes on to use it. However, this would allow for more flexible decision making when tail sampling, as the timeout before sampling occurs could be per Span rather than per Trace.
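
A rough sketch of how a collector might use such a count is below. Everything here is an assumption made for illustration: `SpanRecord`, its `propagate_count` and `remote_parent` fields, and both functions are hypothetical, and the sketch assumes each propagated context produces at most one span with a remote parent. It also ignores spans whose parent has not arrived yet.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SpanRecord:
    span_id: str
    parent_id: Optional[str]
    end_time: float              # seconds; when the span was finalised
    propagate_count: int = 0     # hypothetical attribute proposed in this issue
    remote_parent: bool = False  # assumption: span was started from a propagated context


def outstanding_propagations(spans):
    """Map span_id -> number of propagated contexts with no matching child yet."""
    arrived = Counter(s.parent_id for s in spans
                      if s.remote_parent and s.parent_id is not None)
    return {s.span_id: s.propagate_count - arrived[s.span_id]
            for s in spans if s.propagate_count > arrived[s.span_id]}


def ready_to_sample(spans, now, per_span_timeout):
    """True once every span still missing children has waited out its own timeout."""
    by_id = {s.span_id: s for s in spans}
    return all(now - by_id[span_id].end_time >= per_span_timeout
               for span_id in outstanding_propagations(spans))


root = SpanRecord("a", None, end_time=10.0, propagate_count=2)
child = SpanRecord("b", "a", end_time=8.0, remote_parent=True)
print(outstanding_propagations([root, child]))  # one propagation from "a" unaccounted for
```

The point of the design is that a trace with nothing outstanding can be sampled immediately, while the timeout only applies to the spans that are still waiting on a receiver that may never report.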

Perhaps an extension of this idea would be for parent Spans to be able (optionally) to specify expectations they have of their children, e.g. "I expect this child to complete within 10,000 ms". This could further enhance the decision making when sampling and potentially drive other interesting workflows.
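
As a purely illustrative sketch of the expectation idea, the snippet below assumes a hypothetical `expected_child_ms` field on the parent and assumes that children start no later than the parent ends. `ExpectingSpan`, `wait_deadline_ms` and `children_over_budget` are invented names, not part of any existing API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ExpectingSpan:
    span_id: str
    parent_id: Optional[str]
    start_ms: int
    end_ms: int
    expected_child_ms: Optional[int] = None  # hypothetical: budget declared for each child


def wait_deadline_ms(parent):
    """Latest time a sampler would wait for this span's children.

    Assumes children start no later than the parent ends.
    """
    if parent.expected_child_ms is None:
        return parent.end_ms
    return parent.end_ms + parent.expected_child_ms


def children_over_budget(spans):
    """Return ids of spans that ran longer than their parent's declared expectation."""
    by_id = {s.span_id: s for s in spans}
    late = []
    for s in spans:
        parent = by_id.get(s.parent_id)
        if parent is None or parent.expected_child_ms is None:
            continue
        if s.end_ms - s.start_ms > parent.expected_child_ms:
            late.append(s.span_id)
    return late


parent = ExpectingSpan("p", None, start_ms=0, end_ms=2_000, expected_child_ms=10_000)
fast = ExpectingSpan("c1", "p", start_ms=500, end_ms=1_500)
slow = ExpectingSpan("c2", "p", start_ms=500, end_ms=12_000)
print(wait_deadline_ms(parent), children_over_budget([parent, fast, slow]))
```

A sampler could use the deadline to bound how long it waits for a span's children, and the over-budget list is one example of the "other workflows" such expectations might drive, such as always keeping traces in which a child missed its budget.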

@tdinucci tdinucci added the spec:trace Related to the specification/trace directory label May 22, 2021
@carlosalberto carlosalberto added area:sampling Related to trace sampling area:sdk Related to the SDK labels May 24, 2021