Use most recent event for trace resource name #1695
Conversation
Looks great overall. I left a minor suggestion.
Co-authored-by: Marco Costa <marco.costa@datadoghq.com>
Hey! One important thing to notice is that this approach is not equivalent to the previous one. In particular, the description states:
This assumption can be broken if the profiler is not sampling "often enough". This is because, for integrations where the resource name is set at the end, there's only a small-ish window of opportunity between the resource name being set and the span finishing, and the profiler MUST sample during that time, or otherwise it will miss the information. The previously implemented approach covers this gap, because it only requires that profile encoding occurs after the resource name is set, which is a much larger window of opportunity.

Here's an experiment that shows this happening. Example application:

```ruby
require 'bundler/inline'

gemfile do
  source 'https://rubygems.org'

  gem 'google-protobuf'
  gem 'ddtrace', path: '.'
  gem 'rack'
  gem 'open-uri'
  gem 'dogstatsd-ruby'
end

Datadog.configure do |c|
  c.use :rack
end

def foo_request
  URI.open("http://127.0.0.1:8080/foo").read
end

def start_web_server
  print "Starting http server"

  Thread.new do
    Rack::Handler::WEBrick.run(
      Datadog::Contrib::Rack::TraceMiddleware.new(proc { |env| [200, {"Content-Type" => "text/plain"}, ["#{sleep 2}\n"]] }),
      Port: 8080,
      AccessLog: [],
    )
  end

  while true
    begin
      foo_request
      puts "\nWeb server started!"
      return
    rescue Errno::ECONNREFUSED
      print '.'
    end
  end
end

start_web_server

puts "Running request loop..."

loop do
  foo_request
end
```

I additionally modified the pprof encoder to print the trace resource seen for each span id:

```diff
diff --git a/lib/ddtrace/profiling/pprof/stack_sample.rb b/lib/ddtrace/profiling/pprof/stack_sample.rb
index 1d61bf34..fd6f8986 100644
--- a/lib/ddtrace/profiling/pprof/stack_sample.rb
+++ b/lib/ddtrace/profiling/pprof/stack_sample.rb
@@ -124,16 +124,18 @@ module Datadog
         ).trace_resource

         if trace_resource && !trace_resource.empty?
           labels << Perftools::Profiles::Label.new(
             key: builder.string_table.fetch(Datadog::Ext::Profiling::Pprof::LABEL_KEY_TRACE_ENDPOINT),
             str: builder.string_table.fetch(trace_resource)
           )
         end
+
+        puts "For #{span_id} => #{trace_resource.inspect}"
       end

       labels
     end

     def debug_statistics
       "unique stacks: #{@processed_unique_stacks}, of which had active traces: #{@processed_with_trace_ids}"
     end
```

Here's how it looks with this branch applied:
(I left it running for a while, and did not see a single resource that was non-nil.)

...and here's how it looks on master:

The two outputs are quite different.

Having said that, there's an additional consideration I believe is relevant. As discussed in this comment, right now we have only 5 integrations that generate spans of the relevant type, and after I spent some time on it, out of the 5 integrations, only rack does not set the correct resource at the beginning of the request. Which is why that's the one I picked for my experiment above.

So if all we're doing is breaking rack, and rack is not very interesting anyway, perhaps we could just not care that we are not able to get this data for rack. I guess our decision needs to take that into account.

@delner @marcotc let me know your thoughts.
I think the point above about sampling is fair. However, the additional consideration is the important one:

We have plans to change tracing rules such that you can't mutate the resource of a span that isn't active. This is because mutating spans (changing history) is incompatible with much of the distributed/async tracing we want to make possible in the tracer. This mutation is what happens in the case of Rack; the Rails integration modifies the Rack span to try to give the trace a prettier name (a minimal sketch of this pattern follows below). Instead, we should just elect the resource name on the span after we've assessed what the "best" name is from among the spans in the trace (in this case, likely Rails.) This is why I have suggested choosing a resource name for the profile shouldn't be handled in

I would not expect this resource name mutation behavior to happen with integrations other than Rack, and hopefully soon Rack won't do this either. So if that isn't considered too big of a negative (which it isn't, IMO), I would still opt for this change, and try to make this trace resource feature more robust by other means.
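To make the mutation pattern above concrete, here is a minimal, hypothetical sketch (not the actual integration code) of an inner framework span rewriting the Rack root span's resource after the fact, using the public block form of `Datadog.tracer.trace`:

```ruby
require 'ddtrace'

# Hypothetical illustration of the "changing history" pattern described above;
# the real Rack/Rails integrations are considerably more involved.
Datadog.tracer.trace('rack.request', resource: 'GET 200') do |rack_span|
  Datadog.tracer.trace('rails.action_controller') do |rails_span|
    rails_span.resource = 'UsersController#show'

    # The inner (Rails-level) span reaches back and mutates the root (Rack)
    # span's resource to give the trace a prettier name.
    rack_span.resource = 'UsersController#show'
  end
end
```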
I haven't checked; do you remember if other entry-point-ish integrations (delayed job/sidekiq/resque/sucker punch/sneakers/shoryuken/que/qless/kafka/...) either set the resource at the beginning, or are able to? I'm curious if we'd need to deal with them setting the resource at the end, or if it really is just rack.
To be honest, if it's just for rack, I would suggest an additional alternative: just assume that the resource is a constant for a given span, like the span id and trace id are. This would mean dropping the WDWYT?
Actually one extra question: what happens if customers are using the rack integration together with rails/grape/etc... Is the root span going to be the rails/grape/etc... span, or the rack one?
Yes, they are capable of doing so, as typically the resource is not something discovered post-operation, but something that describes the operation about to occur. This is especially important because if the operation were to fail, we'd want it to tag the resource anyway, so it's clear what resource failed: this tagging can only happen before the failure.

The "local root span" will always be the initial span (typically Rack), but these frameworks will modify the resource of this local root span in order to have it reflect a pretty trace name, given some assumptions about how the UI works. This is technically incorrect (i.e. Rack doesn't know anything about Rails routes), even if convenient.

It's possible the "local root span" for a trace will be different if the trace crosses execution contexts, that is, if the trace spans processes or threads. This is because we don't copy trace spans between execution contexts, just the trace and parent IDs, thus a new "local root span" is made. This last part about execution contexts may change in the near future, as we change our model for context management.

So it isn't safe to make any assumptions about the structure of a trace long term; a feature that is agnostic and sticks to the "active span" API will be much more resilient against such changes.
I think this is fine. Can you clarify your desired aggregation behavior in this case? Do you still want to grab the first encountered sample for a trace? (Sans timestamp-based updates?) Or drop the aggregation behavior entirely?
👍 this makes a lot of sense. I guess only Rack, with its "pass the bucket along and let's see who uses it" approach, ends up being weird, because it conflates the decision to "own" the request with the actual execution.
Ah... I guess this throws a wrench into my simplification idea. If I'm reading your note correctly, when rails/sinatra/grape are in use alone, then due to (#1626, #1628, #1629) it's guaranteed that if the profiler sees a resource, it will see the correct resource. BUT if the rack integration is enabled alongside those, then there's still a brief period between the rack span being created and rails/sinatra/grape "backfilling" the rack span. That brings us back to needing some "backfilling mechanism", which so far is either what is in master or what is in this PR.

Thanks for the extra info and for putting up with the detour; now back to the regular discussion of the contents of this PR 😄

To start with, I'm ok if we break the "only-rack" testcase I posted above, because in general it's a testcase for integrations that only set the resource at the end of the request, and let's assume we can fix all those we care about to set the correct resource at the beginning. Thus, the focus right now is the behavior of web framework integrations when combined with rack (I'll call them rack+webf from now on to make it clear that these are what I mean).

For longer-running rack+webf spans (I'll put some numbers on what "longer" means below), the mechanism in master and the mechanism in this PR are equivalent. Even if the profiler samples the rack+webf span before rails/... sets the correct resource name (at the beginning of their own span), it will still take more samples over the lifetime of the request, and thus I'm confident we'll eventually observe the correct resource, and thus this PR's approach of using the last one seen works 💯🚀.

I'm less confident about the approach in this PR for short-running rack+webf spans. Let's get into some numbers: the profiler employs a dynamic sampling rate, which samples up to 100 times per second. But in reality, profiling samples way less than that.

So, what this means for short-running rack+webf spans is that we'll get at most a single sample per span (see the back-of-the-envelope sketch after this comment). And as the servers get more loaded, more threads are used, and the profiler slows down the dynamic sampling rate, the more likely this becomes. My concern is therefore: what if, for this one sample, we miss the resource? With the mechanism in master, this is handled; we can get the resource at any point we want in the future. But with the mechanism in this PR, we'll miss the resource.

Thus, I would prefer to replace the mechanism in master with something that still allows this "get resource at any point in the future" behavior, although I do realize that we could do more experiments to determine if in practice we still get the resource most of the time, even for short-running rack+webf spans.

Thoughts?
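To put the short-request concern in rough numbers, here's a back-of-the-envelope sketch; the effective sampling rate and the request duration below are illustrative assumptions, not measurements:

```ruby
# Expected number of profiler samples landing inside one short rack+webf request.
# Both inputs are assumptions for illustration only.
effective_samples_per_second = 30.0  # assumed effective rate, well below the 100 Hz maximum
request_duration_seconds     = 0.020 # an assumed 20ms request

expected_samples = effective_samples_per_second * request_duration_seconds
puts expected_samples
# => 0.6 -- usually zero or one sample per request, so if that single sample
#    misses the resource, there is no later chance to recover it.
```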
In particular, this doesn't have to look quite like what master has; for instance I'd be happy with providing a callback object that the tracer would call when the root span resource is set, or when the root span is finished, and doing the bookkeeping myself. (or other such ideas that change the way the information is conveyed)
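For illustration only, here's a rough sketch of what such a callback object could look like; none of these hooks exist in the tracer today, and all of the names are made up:

```ruby
# Hypothetical bookkeeping object the profiler could hand to the tracer.
# The tracer would (hypothetically) invoke #root_span_resource_set whenever a
# root span's resource is assigned or updated.
class ProfilerTraceResourceTracker
  def initialize
    @resource_by_root_span_id = {}
  end

  def root_span_resource_set(root_span)
    @resource_by_root_span_id[root_span.span_id] = root_span.resource
  end

  # The profiler could then look the resource up at pprof-encoding time,
  # long after the sample was taken.
  def resource_for(root_span_id)
    @resource_by_root_span_id[root_span_id]
  end
end
```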
So, putting together some of the constraints/circumstances we have listed in each of our comments:
I think this is the biggest difference in functional behavior between the two. Some thoughts about the pragmatism of the outcomes here:
All things considered, I'm tempted to say we gain more by adopting this PR in the spirit it's written in, as I think we would sacrifice little on the usage side. (Still open to more information I hadn't considered on this point, if there's more to be said, but this is my current leaning.) However:
This does remind me that we do have a processing pipeline mechanism in which you can add a proc to look at each output span; it could be set up when the profiler starts (a rough sketch of this idea is included after this comment). I'm not sure this would be good enough, because the pipeline is only triggered when the whole trace is flushed (root span is finished), not when individual spans are closed. (EDIT: actually worse, I think the pipeline only runs when the writer runs, as often as every second, so this is probably no good.)

As we consider changes to context management for tracing, there might be a similar opportunity to implement callbacks for spans. It might even be acceptable to somehow tie this into

The good news is that

All that said, I think my order of preference in resolving this is:
Let me know what you think @ivoanjo.
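For reference, a minimal sketch of the processing pipeline idea mentioned above, assuming the `Datadog::Pipeline.before_flush` hook and a made-up global hash (`TRACE_RESOURCES` is an invented name); as noted, this runs only at flush time, so it likely comes too late for the profiler's purposes:

```ruby
require 'ddtrace'

# Made-up storage the profiler could (hypothetically) read from at encode time.
TRACE_RESOURCES = {}

# Register a pipeline step that sees each trace (an array of spans) before it
# is flushed, and records the local root span's resource per trace id.
Datadog::Pipeline.before_flush do |trace|
  root_span = trace.min_by(&:start_time) # assumption: the earliest span is the local root
  TRACE_RESOURCES[root_span.trace_id] = root_span.resource if root_span
  trace # pipeline steps must return the (possibly modified) trace
end
```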
Not quite :)
Note that this is used as a grouping/slicing mechanism on the profiler side, not as a "blame Rails" mechanism. The stacks will obviously show where exactly in the code the cpu/wall-clock time was spent, so customers will be able to tell that, to serve a given Rails endpoint, a lot of time was actually spent in some Rack middleware they have, even before the request gets to Rails.
I'd say this is where tracing and profiling end up separating. We're interested in providing visibility into everything the app is doing, whether short or long tasks. If an app does 99.99% short requests and a few outlier long requests, we want to have information on all of them; hotspots data becomes available for requests longer than 10ms, and for the shorter ones users can look at individual profiles and/or aggregated data.
On closer inspection, my suggestion of "for instance I'd be happy with providing a callback object that the tracer would call when the root span resource is set, or when the root span is finished, and doing the bookkeeping myself" would actually only work if the callback was called when the root span resource is set, not when the span finishes. Otherwise, we're back to the rack+webf issue I described -- if the callback is only called at the end of the span, that's already too late. So this
I don't think option 1 is technically feasible at this time, but I definitely would not object to it, as it would simplify a lot on the client side and effectively allow us to remove the whole code for gathering/tracking resources. In my opinion, there's a bunch of things in flux in the planning for dd-trace-rb: refactoring spans (+ adding callbacks?), possibly stopping the rack resource mutation, etc.; and thus I favor keeping the approach from master until we're far enough into those changes/redesigns to be able to add a replacement mechanism.
But it hasn't reached Rails in the time period, and from the context of the time period, it may never reach Rails if the Rack middleware short circuits. Thus modifying it retrospectively after the time period when it might be in Rails is revising history, which is why I don't think this is accurate or good practice. If we were aggregating all the samples from a trace's lifetime in the application, it would be more appropriate to look at the set and select the best resource name. It's not really the responsibility of an individual span (Rack) to decide & communicate what the trace name is, as it only has a small set of the trace information (which is why Rack doesn't set a better resource in the first place.)
IMO, this actually reduces the flux by changing

Having considered the pros (simpler design, more concurrency-safe code) and cons (an edge case where requests are sampled only during Rack but never during or after Rails), I think the stability trade-offs are worthwhile, so I intend to merge this PR.

Thank you for your feedback; I think it's given us a thorough understanding of the behavior here, and it will be useful when considering another iteration on this feature.
What does it change?
This pull request changes profiling to use the most recent event to derive the resource name for a particular trace.
When the pprof encoder for the profiler is given a set of events to export, it reduces them down to a minimal set that represents all the events produced. Since the most recent event for any given trace should reflect the most accurate resource name, we can use that as the canonical resource name for all events on the trace. So during this pprof building process, we can simply find which event is the most recent one for any given trace, and use that for the resource label.
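Illustratively, the per-trace selection described above boils down to something like the following sketch; the data shapes are simplified stand-ins, not the encoder's actual classes:

```ruby
# Simplified stand-in for a profiling event: which trace it belongs to, when it
# was recorded, and the trace resource observed at that moment (possibly nil).
StackEvent = Struct.new(:trace_id, :timestamp, :trace_resource)

# For each trace, use the resource seen on its most recent event.
def resource_by_trace(events)
  events
    .group_by(&:trace_id)
    .transform_values { |trace_events| trace_events.max_by(&:timestamp).trace_resource }
end

events = [
  StackEvent.new(1, 100, nil),        # sampled before the framework set the resource
  StackEvent.new(1, 250, 'GET /foo'), # the most recent sample wins
  StackEvent.new(2, 180, 'GET /bar'),
]
resource_by_trace(events) # => { 1 => "GET /foo", 2 => "GET /bar" }
```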
By doing so, it makes the `TraceResourceContainer` obsolete, thus reverting this as well.

Why this change?
Overall it's meant to simplify the trace internals, and the contract between tracing and profiling, while preserving (and even enhancing) the robustness of this profiling resource feature.
More specifically, the `TraceResourceContainer` has some drawbacks, and this solution is meant to address them:
- `dup` (if we copy a Span, for example.)