Feature to split tailsampling into two phases pre and post sample #30319
Comments
Correct me if I am wrong: the first tail sampling processor will not drop any traces, but will tag them as sampled or not sampled, so that the span metrics connector can attach exemplars from the sampled traces to metrics, right? I understand that this benefits the metrics, but I feel the tail sampling processor should not be extended further to support it, and it takes 2x CPU and memory (cmiiw) to run. What if we just add a policy to the span metrics connector to spot important traces? |
The pre-sample phase does the actual policy execution, which determines and tags the sampled traces.
It will increase CPU usage, but not 2x, because post-sample does not run any policies. It just iterates over the spans in each trace and keeps or drops them depending on whether they are tagged.
Adding a policy to the span metrics connector could itself increase CPU usage 2x, because for it to make sense we would need to duplicate the same logic in the metrics component. Worst case: duplicated code and 2x CPU usage. There is also a chance that the two components would not come to the same conclusion about whether to keep or drop a span. |
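To make the cost argument concrete, here is a minimal Python sketch of the two phases. It is not the collector's implementation: `Span`, `Trace`, `is_slow`, `presample` and `postsample` are hypothetical names, and only the `tail_sampling.sample` attribute comes from this issue. Policies run once, in the first phase; the second phase only looks at the tag.

```python
from dataclasses import dataclass, field

SAMPLE_ATTR = "tail_sampling.sample"  # attribute name taken from the issue text


@dataclass
class Span:
    name: str
    duration_ms: float
    attributes: dict = field(default_factory=dict)


@dataclass
class Trace:
    trace_id: str
    spans: list


def is_slow(trace, threshold_ms=500.0):
    """Hypothetical policy: sample traces that contain at least one slow span."""
    return any(span.duration_ms >= threshold_ms for span in trace.spans)


def presample(traces, policies):
    """Run the policies once and tag sampled traces; every trace is passed on."""
    for trace in traces:
        if any(policy(trace) for policy in policies):
            for span in trace.spans:
                span.attributes[SAMPLE_ATTR] = True
    return traces


def postsample(traces):
    """Evaluate no policies: keep only traces that carry the tag."""
    return [
        trace
        for trace in traces
        if any(span.attributes.get(SAMPLE_ATTR) is True for span in trace.spans)
    ]
```

In this sketch the second phase is a single pass over span attributes, which is why its cost should stay well below that of re-running the policies.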
Thank you @tiithansen. So the pre-sample phase is actually a dry-run phase, right? // The following comments are just my personal point of view. We need the maintainers' support here.
So all we need to modify in the tail sampling processor (if this proposal is accepted) is a dry-run config option, keeping most things working as before. But that's just my thought, and I would like more input from both the issue author and the maintainers. Thanks again for your contribution @tiithansen |
Yes, it pretty much is a dry run in that sense. Using some other, more lightweight processor to do the actual dropping might make sense, because tail sampling has a lot going on inside: it batches traces together, etc. But for the sake of simplicity I implemented it in tail sampling for now. There is already a PR open with an initial implementation. Feel free to have a look at how it's implemented. |
I would love to know which policy sampled the trace (in presample). Perhaps the policy name would make a good default attribute value? The existence suggests |
I agree with @jiekun: we should use existing components to filter out traces without the sample attribute, and focus the implementation on the dry-run. I'm not yet sure how I feel about "presample" and "dry-run"; I suspect there's a better name for this mode of operation 🤔 |
Adding the name of the policy which triggered sampling should be fairly easy. But for debugging it would make sense to add the whole decision tree to the attribute value. For example, if we have nested and policies, then it does not give any value if we just see root
Regarding "presample" and "dry-run": maybe we could just have two types of behaviors in |
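As an illustration of the decision-tree idea, the following Python sketch evaluates a nested and policy and returns a string naming every sub-policy that contributed to the decision. The dictionary layout, the `evaluate` function and the output format are all assumptions made for this example; none of them come from the tail sampling processor or from the PR.

```python
def evaluate(policy, trace):
    """Return (matched, description); description names each contributing policy."""
    if policy["type"] == "and":
        parts = []
        for sub_policy in policy["sub_policies"]:
            matched, description = evaluate(sub_policy, trace)
            if not matched:
                # An "and" policy fails as soon as one sub-policy fails.
                return False, ""
            parts.append(description)
        return True, f"{policy['name']}({' & '.join(parts)})"
    # Leaf policy: apply its predicate to the trace.
    matched = policy["match"](trace)
    return matched, (policy["name"] if matched else "")


# Hypothetical nested policy: sample traces that are both failed and slow.
EXAMPLE_POLICY = {
    "name": "slow-errors",
    "type": "and",
    "sub_policies": [
        {"name": "errors", "type": "leaf", "match": lambda t: t["status"] == "ERROR"},
        {"name": "slow", "type": "leaf", "match": lambda t: t["duration_ms"] > 1000},
    ],
}
```

The returned description could then be stored as the attribute value in place of a plain boolean, so that a sampled trace shows which sub-policies matched instead of only the root policy name.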
@tiithansen excellent point and great suggestions. Is there "prior art" with another component having this kind of mode switch? I'm wondering whether we start to establish a convention or go it alone. |
FYI Honeycomb Refinery refers to "dry run mode" https://github.com/honeycombio/refinery#dry-run-mode |
To me this
I will update the pull request soon with the changes that have been discussed here. |
@portertech @jiekun I have updated the PR. The tailsampling README now also contains a full example of how to use tailsampling, filterprocessor and spanmetrics together. |
This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. |
This issue has been closed as inactive because it has been stale for 120 days with no activity. |
Component(s)
connector/spanmetrics, processor/tailsampling
Is your feature request related to a problem? Please describe.
When generating spanmetrics with exemplars, it's not possible to specify which traces are going to be stored by the tailsampling processor and which are not. Currently, metrics get exemplars for every trace passing through the spanmetrics connector, even if the trace is not actually stored. This causes frustration for developers trying to debug spikes on graphs.
Describe the solution you'd like
Split tailsampling into two phases:
- Presample: passes all traces to the next stages, tagging the sampled ones with the attribute tail_sampling.sample: true.
- Postsample: drops the traces that do not have tail_sampling.sample: true set.
Between these two stages it is possible to run the spanmetrics connector, which would check the tail_sampling.sample attribute to determine whether it should export exemplars or not.
Sample pipeline:
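The exemplar check described above can be illustrated with a minimal Python sketch. It is not collector configuration and does not use the spanmetrics connector's real API: `record_latency` and the dictionary layout are hypothetical, and only the `tail_sampling.sample` attribute name comes from this issue.

```python
SAMPLE_ATTR = "tail_sampling.sample"  # attribute name taken from the issue text


def record_latency(histogram, span):
    """Always record the span; attach its trace ID as an exemplar only if tagged."""
    point = {"value_ms": span["duration_ms"]}
    if span["attributes"].get(SAMPLE_ATTR) is True:
        # Only traces that will actually be stored are referenced as exemplars.
        point["exemplar_trace_id"] = span["trace_id"]
    histogram.append(point)
    return point
```

Every span still contributes to the metric, so the graphs stay complete, but an exemplar only ever points at a trace that the later drop stage will keep.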
Describe alternatives you've considered
No response
Additional context
No response