Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[PROF-7307] Wire up allocation sampling into CpuAndWallTimeWorker #3103

Merged
merged 3 commits into from
Sep 5, 2023

Conversation

ivoanjo
Copy link
Member

@ivoanjo ivoanjo commented Sep 1, 2023

What does this PR do?

In #2657 we added support for the Collectors::ThreadContext to sample a given object allocation.

But until now, triggering these kinds of allocations was only done in our test suite.

This PR adds the missing part: actually sampling objects as Ruby allocates them.

In the Collectors::CpuAndWallTimeWorker, we already had a TracePoint that is called on every new object allocation (RUBY_INTERNAL_EVENT_NEWOBJ), but we were only using it to count allocations.

This PR changes the TracePoint code to actually call into the Collectors::ThreadContext and thus allow allocation sampling to happen.

(Also included is a crash fix + another tiny tweak. I suggest reviewing commit-by-commit as it makes more sense!)

Motivation:

This PR is another step towards the allocation profiling feature.

Additional Notes:

It's not yet possible to turn on allocation profiling via configuration.

I am reserving that change for later. But it's easy to do: change the profiling/component.rb to use alloc_samples_enabled: true and allocation_sample_every: 1.

This will result in a big overhead impact to the application. I'll be working on this in later PRs.

In particular, the sampling approach we're using (allocation_sample_every), is only a placeholder; it will get replaced before we get to beta. This is why I don't want to add it as a real user-available setting -- because it's temporary.

How to test the change?

This change includes test coverage. You can also manually check it out, by following the instructions in the "Additional Notes".

(There's a feature flag needed for this data to show up in the Datadog UX).

**What does this PR do?**

In #2657 we added support for the `Collectors::ThreadContext` to
sample a given object allocation.

But until now, triggering these kinds of allocations was only done in
our test suite.

This PR adds the missing part: actually sampling objects as Ruby
allocates them.

In the `Collectors::CpuAndWallTimeWorker`, we already
had a `TracePoint` that is called on every new object allocation
(`RUBY_INTERNAL_EVENT_NEWOBJ`), but we were only using it to
count allocations.

This PR changes the `TracePoint` code to actually call into the
`Collectors::ThreadContext` and thus allow allocation sampling to
happen.

**Motivation:**

This PR is another step towards the allocation profiling feature.

**Additional Notes:**

It's not yet possible to turn on allocation profiling via configuration.

I am reserving that change for later. But it's easy to do:
change the `profiling/component.rb` to use `alloc_samples_enabled: true`
and `allocation_sample_every: 1`.

This will result in a big overhead impact to the application. I'll be
working on this in later PRs.

In particular, the sampling approach we're using
(`allocation_sample_every`), is only a placeholder; it will get
replaced before we get to beta. This is why I don't want to add it
as a real user-available setting -- because it's temporary.

**How to test the change?**

This change includes test coverage. You can also manually check it out,
by following the instructions in the "Additional Notes".

(There's a feature flag needed for this data to show up in the
Datadog UX).
Today I learned that sometimes Ruby allocates objects without a class!

See notes in diff for more details.
These show up from time to time and will be user-visible, so it's nice
to have a name for them.

We could consider hiding them, but my thinking is: what if there's
something in the user's code that causes a lot of them to be created?

That seems relevant to report on, even if they are VM objects, since
the user will probably want to change their code to avoid this.
@ivoanjo ivoanjo requested a review from a team September 1, 2023 12:01
@github-actions github-actions bot added the profiling Involves Datadog profiling label Sep 1, 2023
if (name_length > 0) {
class_name = (ddog_CharSlice) {.ptr = name, .len = name_length};
} else {
// @ivoanjo: I'm not sure this can ever happen, but just-in-case
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible for this to happen with anonymous classes?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question! It shouldn't happen -- we get the Ruby autogenerated name for them:

[1] pry(main)> Class.new.to_s
=> "#<Class:0x00005b721ef2a558>"

Here's the output of an actual profile with such a class:

allocation class:[#<Class:0x00005dd2c7da98a8>] ruby vm type:[T_OBJECT] thread id:[123612321531712 (2560)] thread name:[main]

@ivoanjo ivoanjo merged commit 3e3ad39 into master Sep 5, 2023
164 checks passed
@ivoanjo ivoanjo deleted the ivoanjo/prof-7307-wire-up-allocation-sampling branch September 5, 2023 15:22
@github-actions github-actions bot added this to the 1.15.0 milestone Sep 5, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
profiling Involves Datadog profiling
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants