Single pass SpanFilter #1071

tjwp · 2020-06-08T13:06:06Z

This patch modifies Datadog::Pipeline::SpanFilter to make a single pass through the trace array.

The motivation for this change is that we ran into a situation where a trace was being generated with tens of thousands of spans. The trace was fairly shallow and we were filtering around half the spans (all without children). This became a performance issue as each span that was deleted results in another pass through the remaining array.

We have since modified our tracing to avoid the large traces, but I believe the SpanFilter remains a performance concern.

One case this does not handle is if a child span can appear before a parent in the trace array. The current tests do not cover that scenario. If that is a concern, then the approach below could be modified to make additional passes until the trace array stops changing.

delner · 2020-06-08T14:28:52Z

@tjwp Hmmm, I see. This is a great catch; thank you!

I think it'd be good to show a benchmark for this, before and after, to show the kind of impact it makes. If we can do that, get CI to run green (I'll rerun flaky CI tests for you), and double check the logic, then I'd be happy to merge this.

tjwp · 2020-06-08T16:12:29Z

@delner Here is a very naive benchmark that I ran:

      def test_stress
        n = 500
        trace = []
        n.times do
          trace << generate_span('one')
          trace << generate_span('two')
        end
        filter = SpanFilter.new { |span| span.name[/one/] }
        puts ""
        Benchmark.bm do |x|
          x.report("#{2 * n} spans") { filter.call(trace) }
        end
      end

There are no child spans to deal with. It just creates a long array of traces and removes half of them.

For the single pass approach from this PR:

              user       system     total       real
1000 spans    0.002075   0.000000   0.002075 (  0.002599)
10000 spans   0.034452   0.004451   0.038903 (  0.040961)
20000 spans   0.040434   0.002822   0.043256 (  0.046305)
100000 spans  0.183695   0.063603   0.247298 (  0.260989)

For the previous approach:

              user       system     total       real 
1000 spans    0.080774   0.002686   0.083460 (  0.087096)
10000 spans   5.613146   0.000000   5.613146 (  5.876156)
20000 spans  29.516540   0.149010  29.665550 ( 31.131399)
100000 spans - gave up before it completed

tjwp · 2020-06-22T12:16:47Z

@delner Any update on this? Thanks!

marcotc · 2020-07-08T20:40:57Z

👋 @tjwp, despite the new limitations introduced, I'm inclined to accept these changes as the gains on large traces are significant.

One thing I'd ask you is to write a documentation on the #call about the limitation when a child span comes before a parent on. It is not currently possible for ddtrace to build such traces, but if any requirements change in the future we'll know of the SpanFilter limitation.

tjwp · 2020-07-09T11:50:22Z

Thanks @marcotc! Comment added.

One thing I'd ask you is to write a documentation on the #call about the limitation when a child span comes before a parent on. It is not currently possible for ddtrace to build such traces, but if any requirements change in the future we'll know of the SpanFilter limitation.

marcotc

Thank you again @tjwp! Shipping this change on our next release, tentatively scheduled for next week.

Single pass SpanFilter

f9649a5

tjwp requested a review from a team June 8, 2020 13:06

Remove compare_by_identity

daa6abd

delner added community Was opened by a community member core Involves Datadog core libraries performance Involves performance (e.g. CPU, memory, etc) labels Jun 8, 2020

Add comment on parent spans appearing before child spans

29a5f58

marcotc approved these changes Jul 9, 2020

View reviewed changes

marcotc merged commit 8f575a1 into DataDog:master Jul 9, 2020

marcotc added this to the 0.38.0 milestone Jul 9, 2020

delner mentioned this pull request May 25, 2022

SpanFilter not working after upgrading to 1.0 #2038

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Single pass SpanFilter #1071

Single pass SpanFilter #1071

tjwp commented Jun 8, 2020 •

edited

Loading

delner commented Jun 8, 2020

tjwp commented Jun 8, 2020

tjwp commented Jun 22, 2020

marcotc commented Jul 8, 2020

tjwp commented Jul 9, 2020

marcotc left a comment

Single pass SpanFilter #1071

Single pass SpanFilter #1071

Conversation

tjwp commented Jun 8, 2020 • edited Loading

delner commented Jun 8, 2020

tjwp commented Jun 8, 2020

tjwp commented Jun 22, 2020

marcotc commented Jul 8, 2020

tjwp commented Jul 9, 2020

marcotc left a comment

Choose a reason for hiding this comment

tjwp commented Jun 8, 2020 •

edited

Loading