-
Notifications
You must be signed in to change notification settings - Fork 306
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[profiling] Associate tracing span IDs with samples in the wall profiler #3371
Conversation
Overall package sizeSelf size: 4.88 MB Dependency sizes
🤖 This report was automatically generated by heaviest-objects-in-the-universe |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM with one minor nit.
durationMillis: this._flushIntervalMillis, | ||
sourceMapper: this._mapper, | ||
customLabels: this._codeHotspotsEnabled, | ||
lineNumbers: false }) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lineNumbers: false }) | |
lineNumbers: false | |
}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the review!
I already asked but I forgot what was the answer: why isn't it flagged by linter? which auto-formatter should I use to make sure that code complies with repo coding standards ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're probably missing an eslint config for it. https://eslint.org/docs/latest/rules/object-curly-newline appears to be what we're looking for?
ea5034f
to
7e4029a
Compare
Codecov Report
@@ Coverage Diff @@
## master #3371 +/- ##
===========================================
+ Coverage 65.12% 84.13% +19.01%
===========================================
Files 205 211 +6
Lines 7794 8328 +534
Branches 33 33
===========================================
+ Hits 5076 7007 +1931
+ Misses 2718 1321 -1397
... and 66 files with indirect coverage changes 📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
BenchmarksBenchmark execution time: 2023-07-13 13:43:30 Comparing candidate commit 1b00677 in PR branch Found 0 performance improvements and 21 performance regressions! Performance is the same for 449 metrics, 22 unstable metrics. scenario:exporting-pipeline-0.4-16
scenario:exporting-pipeline-0.4-18
scenario:exporting-pipeline-0.4_with_stats-16
scenario:exporting-pipeline-0.4_with_stats-18
scenario:exporting-pipeline-0.5-16
scenario:exporting-pipeline-0.5-18
scenario:exporting-pipeline-0.5_with_stats-16
scenario:exporting-pipeline-0.5_with_stats-18
|
616ce46
to
7338234
Compare
b0d8686
to
42e0d49
Compare
if (this._profilerState[kSampleCount] !== 0) { | ||
this._profilerState[kSampleCount] = 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of resetting here, I'd rather keep a last sample count and compare to it
const sampleCount = this._profilerState[kSampleCount]
if (sampleCount != this._lastSampleCount) {
...
}
this._lastSampleCount = sampleCount
This way JS code never writes to it, only reads it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, I like that !
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
42e0d49
to
72adedf
Compare
Hide functionality behind DD_PROFILING_EXPERIMENTAL_CODEHOTSPOTS_ENABLED and DD_PROFILING_EXPERIMENTAL_ENDPOINT_COLLECTION_ENABLED flags. Profiler tracks changes of active span by subscribing to async hooks before channel and to AsyncResourceStorage enter channel that notifies when current store is changed with enterWith/run. Profiler remember previous span and previous started spans, upon change of active span, it checks if a sample has been taken by native profiler addon, then if that's the case, it updates the context of the sample with span id and root span id computed from previous span / started spans. It also add the tags from the last web span to the context to allow computation of endpoint later on.
Profiler may need the tags to determine span resource name for endpoint aggregation: profiler keeps a reference on web span that was active when each profiling sample is taken. Then during profile export, span tags are used to determine endpoint. Endpoint cannot be determined right away because sometimes tags necessary to determine endpoint are only set just before span is finished.
For web spans, add request tags just after span creation instead of when span finishes. This is needed for profiler code hotspots implementation to determine which span are web spans and should be used to find current endpoint.
72adedf
to
1b00677
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great!
🚢
Benchmark regression is expected: |
@@ -5,6 +5,7 @@ const { channel } = require('../../../diagnostics_channel') | |||
|
|||
const beforeCh = channel('dd-trace:storage:before') | |||
const afterCh = channel('dd-trace:storage:after') | |||
const enterCh = channel('dd-trace:storage:enter') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just FYI, we're aiming long-term to eliminate use of async_hooks, so it would be ideal if we can eliminate these async_hooks lifecycle event channels. Not going to block this now as this will be a longer-term effort, but it's worth thinking about how you can do this without these channels.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dd-trace:storage:enter
is not async hook specific per se, its event fires when the value in the DD wrapper for AsyncResourceStorage
is synchronously modified. I know AsyncResourceStorage
is part of the async hooks library, but if we started using Node's AsyncLocalStorage
under the hood for DD storage, we could still be emitting this enter event.
That said, I don't see us being able to avoid observing async context switches with the before
event as we need to capture the active span on every context switch in case we need to associate it with a taken sample.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, it will be possible to wrap the run
method for our use of AsyncLocalStorage
to get the enter event, but the before
event will likely go away or change substantially at some point so we should be thinking about how we can achieve that in some other way.
It's worth noting that my current work rewriting AsyncLocalStorage
is to implement it on the native side which could mean we expose a native interface to it which could enable you to just track your data in an AsyncLocalStorage
instance and access that from the C++ side on your end whenever you need it. It would mean needing to enter the isolate so if you're doing that in a signal it probably needs verification that it can function safely. If you can review nodejs/node#48528 to see if that works for you, that would be a big help.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is interesting! Looking at implementation of AsyncContextFrame::current
and AsyncContextFrame::get
it might be safe to call from a signal handler. We'll need to be storing a v8::Global
on native side for the key and convert it in a v8::Local
for the AsyncContextFrame::get
call. So we'd still maybe the enter
event, and derive some data from the span into our own ALS. Unfortunately, we'll probably lose the ability to do lazy backfilling of our sampling context like we now do on context switches, although considering we'll only have to do this when a logical async context starts, and not on every switch it might work well.
…ler (#3371) * Initial code for code hotspots and endpoint aggregation in wall profiler Hide functionality behind DD_PROFILING_EXPERIMENTAL_CODEHOTSPOTS_ENABLED and DD_PROFILING_EXPERIMENTAL_ENDPOINT_COLLECTION_ENABLED flags. Profiler tracks changes of active span by subscribing to async hooks before channel and to AsyncResourceStorage enter channel that notifies when current store is changed with enterWith/run. Profiler remembers previous span and previous started spans, upon change of active span, it checks if a sample has been taken by native profiler addon, then if that's the case, it updates the context of the sample with span id and root span id computed from previous span / started spans. It also add the tags from the last web span to the context to allow computation of endpoint later on. Minor changes to other parts of the code to make it all work: === * Report error if start fails * Avoid clearing span context tags upon export Profiler may need the tags to determine span resource name for endpoint aggregation: profiler keeps a reference on web span that was active when each profiling sample is taken. Then during profile export, span tags are used to determine endpoint. Endpoint cannot be determined right away because sometimes tags necessary to determine endpoint are only set just before span is finished. * Add request tags just after span creation For web spans, add request tags just after span creation instead of when span finishes. This is needed for profiler code hotspots implementation to determine which span are web spans and should be used to find current endpoint.
…ler (#3371) * Initial code for code hotspots and endpoint aggregation in wall profiler Hide functionality behind DD_PROFILING_EXPERIMENTAL_CODEHOTSPOTS_ENABLED and DD_PROFILING_EXPERIMENTAL_ENDPOINT_COLLECTION_ENABLED flags. Profiler tracks changes of active span by subscribing to async hooks before channel and to AsyncResourceStorage enter channel that notifies when current store is changed with enterWith/run. Profiler remembers previous span and previous started spans, upon change of active span, it checks if a sample has been taken by native profiler addon, then if that's the case, it updates the context of the sample with span id and root span id computed from previous span / started spans. It also add the tags from the last web span to the context to allow computation of endpoint later on. Minor changes to other parts of the code to make it all work: === * Report error if start fails * Avoid clearing span context tags upon export Profiler may need the tags to determine span resource name for endpoint aggregation: profiler keeps a reference on web span that was active when each profiling sample is taken. Then during profile export, span tags are used to determine endpoint. Endpoint cannot be determined right away because sometimes tags necessary to determine endpoint are only set just before span is finished. * Add request tags just after span creation For web spans, add request tags just after span creation instead of when span finishes. This is needed for profiler code hotspots implementation to determine which span are web spans and should be used to find current endpoint.
…ler (#3371) * Initial code for code hotspots and endpoint aggregation in wall profiler Hide functionality behind DD_PROFILING_EXPERIMENTAL_CODEHOTSPOTS_ENABLED and DD_PROFILING_EXPERIMENTAL_ENDPOINT_COLLECTION_ENABLED flags. Profiler tracks changes of active span by subscribing to async hooks before channel and to AsyncResourceStorage enter channel that notifies when current store is changed with enterWith/run. Profiler remembers previous span and previous started spans, upon change of active span, it checks if a sample has been taken by native profiler addon, then if that's the case, it updates the context of the sample with span id and root span id computed from previous span / started spans. It also add the tags from the last web span to the context to allow computation of endpoint later on. Minor changes to other parts of the code to make it all work: === * Report error if start fails * Avoid clearing span context tags upon export Profiler may need the tags to determine span resource name for endpoint aggregation: profiler keeps a reference on web span that was active when each profiling sample is taken. Then during profile export, span tags are used to determine endpoint. Endpoint cannot be determined right away because sometimes tags necessary to determine endpoint are only set just before span is finished. * Add request tags just after span creation For web spans, add request tags just after span creation instead of when span finishes. This is needed for profiler code hotspots implementation to determine which span are web spans and should be used to find current endpoint.
…ler (#3371) * Initial code for code hotspots and endpoint aggregation in wall profiler Hide functionality behind DD_PROFILING_EXPERIMENTAL_CODEHOTSPOTS_ENABLED and DD_PROFILING_EXPERIMENTAL_ENDPOINT_COLLECTION_ENABLED flags. Profiler tracks changes of active span by subscribing to async hooks before channel and to AsyncResourceStorage enter channel that notifies when current store is changed with enterWith/run. Profiler remembers previous span and previous started spans, upon change of active span, it checks if a sample has been taken by native profiler addon, then if that's the case, it updates the context of the sample with span id and root span id computed from previous span / started spans. It also add the tags from the last web span to the context to allow computation of endpoint later on. Minor changes to other parts of the code to make it all work: === * Report error if start fails * Avoid clearing span context tags upon export Profiler may need the tags to determine span resource name for endpoint aggregation: profiler keeps a reference on web span that was active when each profiling sample is taken. Then during profile export, span tags are used to determine endpoint. Endpoint cannot be determined right away because sometimes tags necessary to determine endpoint are only set just before span is finished. * Add request tags just after span creation For web spans, add request tags just after span creation instead of when span finishes. This is needed for profiler code hotspots implementation to determine which span are web spans and should be used to find current endpoint.
…ler (#3371) * Initial code for code hotspots and endpoint aggregation in wall profiler Hide functionality behind DD_PROFILING_EXPERIMENTAL_CODEHOTSPOTS_ENABLED and DD_PROFILING_EXPERIMENTAL_ENDPOINT_COLLECTION_ENABLED flags. Profiler tracks changes of active span by subscribing to async hooks before channel and to AsyncResourceStorage enter channel that notifies when current store is changed with enterWith/run. Profiler remembers previous span and previous started spans, upon change of active span, it checks if a sample has been taken by native profiler addon, then if that's the case, it updates the context of the sample with span id and root span id computed from previous span / started spans. It also add the tags from the last web span to the context to allow computation of endpoint later on. Minor changes to other parts of the code to make it all work: === * Report error if start fails * Avoid clearing span context tags upon export Profiler may need the tags to determine span resource name for endpoint aggregation: profiler keeps a reference on web span that was active when each profiling sample is taken. Then during profile export, span tags are used to determine endpoint. Endpoint cannot be determined right away because sometimes tags necessary to determine endpoint are only set just before span is finished. * Add request tags just after span creation For web spans, add request tags just after span creation instead of when span finishes. This is needed for profiler code hotspots implementation to determine which span are web spans and should be used to find current endpoint.
…ler (#3371) * Initial code for code hotspots and endpoint aggregation in wall profiler Hide functionality behind DD_PROFILING_EXPERIMENTAL_CODEHOTSPOTS_ENABLED and DD_PROFILING_EXPERIMENTAL_ENDPOINT_COLLECTION_ENABLED flags. Profiler tracks changes of active span by subscribing to async hooks before channel and to AsyncResourceStorage enter channel that notifies when current store is changed with enterWith/run. Profiler remembers previous span and previous started spans, upon change of active span, it checks if a sample has been taken by native profiler addon, then if that's the case, it updates the context of the sample with span id and root span id computed from previous span / started spans. It also add the tags from the last web span to the context to allow computation of endpoint later on. Minor changes to other parts of the code to make it all work: === * Report error if start fails * Avoid clearing span context tags upon export Profiler may need the tags to determine span resource name for endpoint aggregation: profiler keeps a reference on web span that was active when each profiling sample is taken. Then during profile export, span tags are used to determine endpoint. Endpoint cannot be determined right away because sometimes tags necessary to determine endpoint are only set just before span is finished. * Add request tags just after span creation For web spans, add request tags just after span creation instead of when span finishes. This is needed for profiler code hotspots implementation to determine which span are web spans and should be used to find current endpoint.
What does this PR do?
Uses the new functionality of wall profiler in DataDog/pprof-nodejs#105 to associate wall profiler samples with span IDs.
Motivation
Implement the code hotspots (ability to link traces to profiling samples) for wall profiler.
Additional Notes
Includes #3368.
Profiler takes samples through a signal handler and needs to retrieve the current span ID and local root span ID in this signal handler.
The current design is to notify profiler on tracer side of all changes of current span. Change of active span can occur for two different reasons:
before/after
and publishing todd-trace:storage:before
anddd-trace:storage:after
channels.AsyncResourceStorage.enterWith
orAsyncResourceStorage.run
: notification is done throughdd-trace:storage:enter
channel.Upon notification and if a sample has been taken by profiler since last notification, profiler tracer side calls profiler native side to update span id / root span id of previous sample.
Implementation notes:
Open questions:
AsyncResourceStorage.enterWith
/AsyncResourceStorage.run
to detect active span changes ?