Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Single pass SpanFilter #1071

Merged
merged 3 commits into from
Jul 9, 2020
Merged

Single pass SpanFilter #1071

merged 3 commits into from
Jul 9, 2020

Conversation

tjwp
Copy link
Contributor

@tjwp tjwp commented Jun 8, 2020

This patch modifies Datadog::Pipeline::SpanFilter to make a single pass through the trace array.

The motivation for this change is that we ran into a situation where a trace was being generated with tens of thousands of spans. The trace was fairly shallow and we were filtering around half the spans (all without children). This became a performance issue as each span that was deleted results in another pass through the remaining array.

We have since modified our tracing to avoid the large traces, but I believe the SpanFilter remains a performance concern.

One case this does not handle is if a child span can appear before a parent in the trace array. The current tests do not cover that scenario. If that is a concern, then the approach below could be modified to make additional passes until the trace array stops changing.

@tjwp tjwp requested a review from a team June 8, 2020 13:06
@delner
Copy link
Contributor

delner commented Jun 8, 2020

@tjwp Hmmm, I see. This is a great catch; thank you!

I think it'd be good to show a benchmark for this, before and after, to show the kind of impact it makes. If we can do that, get CI to run green (I'll rerun flaky CI tests for you), and double check the logic, then I'd be happy to merge this.

@delner delner added community Was opened by a community member core Involves Datadog core libraries performance Involves performance (e.g. CPU, memory, etc) labels Jun 8, 2020
@tjwp
Copy link
Contributor Author

tjwp commented Jun 8, 2020

@delner Here is a very naive benchmark that I ran:

      def test_stress
        n = 500
        trace = []
        n.times do
          trace << generate_span('one')
          trace << generate_span('two')
        end
        filter = SpanFilter.new { |span| span.name[/one/] }
        puts ""
        Benchmark.bm do |x|
          x.report("#{2 * n} spans") { filter.call(trace) }
        end
      end

There are no child spans to deal with. It just creates a long array of traces and removes half of them.

For the single pass approach from this PR:

              user       system     total       real
1000 spans    0.002075   0.000000   0.002075 (  0.002599)
10000 spans   0.034452   0.004451   0.038903 (  0.040961)
20000 spans   0.040434   0.002822   0.043256 (  0.046305)
100000 spans  0.183695   0.063603   0.247298 (  0.260989)

For the previous approach:

              user       system     total       real 
1000 spans    0.080774   0.002686   0.083460 (  0.087096)
10000 spans   5.613146   0.000000   5.613146 (  5.876156)
20000 spans  29.516540   0.149010  29.665550 ( 31.131399)
100000 spans - gave up before it completed

@tjwp
Copy link
Contributor Author

tjwp commented Jun 22, 2020

@delner Any update on this? Thanks!

@marcotc
Copy link
Member

marcotc commented Jul 8, 2020

👋 @tjwp, despite the new limitations introduced, I'm inclined to accept these changes as the gains on large traces are significant.

One thing I'd ask you is to write a documentation on the #call about the limitation when a child span comes before a parent on. It is not currently possible for ddtrace to build such traces, but if any requirements change in the future we'll know of the SpanFilter limitation.

@tjwp
Copy link
Contributor Author

tjwp commented Jul 9, 2020

Thanks @marcotc! Comment added.

One thing I'd ask you is to write a documentation on the #call about the limitation when a child span comes before a parent on. It is not currently possible for ddtrace to build such traces, but if any requirements change in the future we'll know of the SpanFilter limitation.

Copy link
Member

@marcotc marcotc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you again @tjwp! Shipping this change on our next release, tentatively scheduled for next week.

@marcotc marcotc merged commit 8f575a1 into DataDog:master Jul 9, 2020
@marcotc marcotc added this to the 0.38.0 milestone Jul 9, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
community Was opened by a community member core Involves Datadog core libraries performance Involves performance (e.g. CPU, memory, etc)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants