-
Notifications
You must be signed in to change notification settings - Fork 375
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Single pass SpanFilter #1071
Single pass SpanFilter #1071
Conversation
@tjwp Hmmm, I see. This is a great catch; thank you! I think it'd be good to show a benchmark for this, before and after, to show the kind of impact it makes. If we can do that, get CI to run green (I'll rerun flaky CI tests for you), and double check the logic, then I'd be happy to merge this. |
@delner Here is a very naive benchmark that I ran: def test_stress
n = 500
trace = []
n.times do
trace << generate_span('one')
trace << generate_span('two')
end
filter = SpanFilter.new { |span| span.name[/one/] }
puts ""
Benchmark.bm do |x|
x.report("#{2 * n} spans") { filter.call(trace) }
end
end There are no child spans to deal with. It just creates a long array of traces and removes half of them. For the single pass approach from this PR:
For the previous approach:
|
@delner Any update on this? Thanks! |
👋 @tjwp, despite the new limitations introduced, I'm inclined to accept these changes as the gains on large traces are significant. One thing I'd ask you is to write a documentation on the |
Thanks @marcotc! Comment added.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you again @tjwp! Shipping this change on our next release, tentatively scheduled for next week.
This patch modifies
Datadog::Pipeline::SpanFilter
to make a single pass through the trace array.The motivation for this change is that we ran into a situation where a trace was being generated with tens of thousands of spans. The trace was fairly shallow and we were filtering around half the spans (all without children). This became a performance issue as each span that was deleted results in another pass through the remaining array.
We have since modified our tracing to avoid the large traces, but I believe the SpanFilter remains a performance concern.
One case this does not handle is if a child span can appear before a parent in the trace array. The current tests do not cover that scenario. If that is a concern, then the approach below could be modified to make additional passes until the trace array stops changing.