-
Notifications
You must be signed in to change notification settings - Fork 375
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[PROF-6558] Show profiler overhead in flamegraph for CPU Profiling 2.0 #2607
Merged
ivoanjo
merged 9 commits into
master
from
ivoanjo/prof-6558-expose-profiler-overhead-v2
Feb 14, 2023
Merged
[PROF-6558] Show profiler overhead in flamegraph for CPU Profiling 2.0 #2607
ivoanjo
merged 9 commits into
master
from
ivoanjo/prof-6558-expose-profiler-overhead-v2
Feb 14, 2023
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
I got tired of the extra boilerplate from using hashes and introduced two small structs to contain the parsed profile data.
This will enable it to be reused when we want to add the profiler overhead as a stack.
**What does this PR do?**: This PR changes the cpu-time/wall-time sampling in the CPU Profiling 2.0 profiler to expose the profiler overhead as part of the captured data. Specifically, during sampling, the profiler now measures how much cpu-time/wall-time is spent by the profiler itself, and adds that to the profiler as a separate stack (that shows up as belonging to dd-trace-rb). **Motivation**: In the old Ruby profiler, the profiler always worked from a background thread. Because we also sampled that background thread, the flamegraph included the overhead of the profiler -- it was the overhead of the background thread. With CPU Profiling 2.0, sampling is done in whatever thread is holding the Global VM Lock, which means that this part of the profiler overhead was not being exposed. With this change, this impact on code is now accounted for. **Additional Notes**: I was torn between a few options, and decided to use the stack of the `CpuAndWallTimeWorker` thread to represent the overhead of the profiler. Thus, when this stack shows up in another thread, it represents the overhead of the profiler on that thread. I could have gone with a placeholder (e.g. just a frame that says "Profiler Sampling" or something similar), but such placeholders break the "Only My Code" feature. Using a stack from a different thread than the one we're supposedly "sampling" adds a bit of complexity; but we can always undo this and go with the placeholder in this becomes problematic. **How to test the change?**: Change includes code coverage.
Ooops! I missed updating this spec since it doesn't run on all Rubies.
Because the new feature relies on adding extra samples to the flamegraph, the ```ruby expect(stats.fetch(:signal_handler_enqueued_sample)).to be >= sample_count ``` assertion was failing.
This will be useful when we want to process the profile and either focus on or disregard these samples.
The profiler overhead frames show up non-deterministically (because they show up in whatever thread has the Global VM Lock), so they can cause a lot of the specs to become flaky. Excluding them should be enough to avoid the issue.
This seems ok. I agree that there's no straightforward lightweight solution here. |
marcotc
approved these changes
Feb 13, 2023
Thanks Marco for the review! :) Merging away! |
Merged
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
What does this PR do?:
This PR changes the cpu-time/wall-time sampling in the CPU Profiling 2.0 profiler to expose the profiler overhead as part of the captured data.
Specifically, during sampling, the profiler now measures how much cpu-time/wall-time is spent by the profiler itself, and adds that to the profiler as a separate stack (that shows up as belonging to dd-trace-rb).
Motivation:
In the old Ruby profiler, the profiler always worked from a background thread. Because we also sampled that background thread, the flamegraph included the overhead of the profiler -- it was the overhead of the background thread.
With CPU Profiling 2.0, sampling is done in whatever thread is holding the Global VM Lock, which means that this part of the profiler overhead was not being exposed. With this change, this impact on code is now accounted for.
Additional Notes:
I was torn between a few options, and decided to use the stack of the
CpuAndWallTimeWorker
thread to represent the overhead of the profiler.Thus, when this stack shows up in another thread, it represents the overhead of the profiler on that thread.
I could have gone with a placeholder (e.g. just a frame that says "Profiler Sampling" or something similar), but such placeholders break the "Only My Code" feature.
Using a stack from a different thread than the one we're supposedly "sampling" adds a bit of complexity; but we can always undo this and go with the placeholder in this becomes problematic.
How to test the change?:
Change includes code coverage.