Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[NO-TICKET] Optimize CodeProvenance#record_loaded_files to avoid allocations #3757

Merged
merged 1 commit into from
Jul 8, 2024

Conversation

ivoanjo
Copy link
Member

@ivoanjo ivoanjo commented Jul 4, 2024

What does this PR do?

This PR replaces a Hash#find with a Hash#any? in the Profiler CodeProvenance collector.

This seemingly innocuous change actually gets rid of a bunch of memory allocations.

It turns out that when Hash#find is being used, Ruby allocates an array to contain the [key, value] pair that's passed into the block, for each entry in the hash (on top of a few more objects -- but this is the biggest offender unless the hash is empty/almost empty).

The VM actually has an optimization for a lot of operations, including Hash#any? that avoids allocating any extra memory, but because Hash#find is actually inherited from Enumerable, this optimization does not kick in.

Motivation:

Eliminate unneeded memory allocations.

Additional Notes:

Here's a very simple reproducer that shows the issue:

puts RUBY_DESCRIPTION

def allocated_now; GC.stat(:total_allocated_objects); end

hash = {a: 1, b: 2, c: 3}
v = nil

before_find = allocated_now
10.times { _, v = hash.find { |k, _| k == :c } }
puts "Found value #{v}, allocated #{allocated_now - before_find}"

before_any = allocated_now
10.times { v = nil; hash.any? { |k, val| (v = val) if k == :c } }

puts "Found value #{v}, allocated #{allocated_now - before_any}"

and here's how it looks on the latest Ruby version:

ruby 3.4.0preview1 (2024-05-16 master 9d69619623) [x86_64-linux]
Found value 3, allocated 64
Found value 3, allocated 1

(Note that there's a few more allocations going on with Hash#find, not only the arrays for the pairs, and we get rid of them ALL!)

Here's a quick comparison of the internal prof-fibonacci test service with/without this change (+ sampling every allocation):

Before
image
After
image

How to test the change?

This operation already has test coverage. I considered adding some counting of allocated objects, but we've had quite a bunch of flakiness in some of the profiler specs when counting objects, so I've decided to not add any performance-specific tests for this.

…cations

**What does this PR do?**

This PR replaces a `Hash#find` with a `Hash#any?` in the Profiler
`CodeProvenance` collector.

This seemingly innocuous change actually gets rid of a bunch of memory
allocations.

It turns out that when `Hash#find` is being used, Ruby allocates an
array to contain the `[key, value]` pair that's passed into the block,
for each entry in the hash (on top of a few more objects -- but this is
the biggest offender unless the hash is empty/almost empty).

The VM actually has an optimization for a lot of operations, including
`Hash#any?` that avoids allocating any extra memory, but because
`Hash#find` is actually inherited from `Enumerable`, this optimization
does not kick in.

**Motivation:**

Eliminate unneeded memory allocations.

**Additional Notes:**

Here's a very simple reproducer that shows the issue:

```
puts RUBY_DESCRIPTION

def allocated_now; GC.stat(:total_allocated_objects); end

hash = {a: 1, b: 2, c: 3}
v = nil

before_find = allocated_now
10.times { _, v = hash.find { |k, _| k == :c } }
puts "Found value #{v}, allocated #{allocated_now - before_find}"

before_any = allocated_now
10.times { v = nil; hash.any? { |k, val| (v = val) if k == :c } }

puts "Found value #{v}, allocated #{allocated_now - before_any}"
```

and here's how it looks on the latest Ruby version:

```
ruby 3.4.0preview1 (2024-05-16 master 9d69619623) [x86_64-linux]
Found value 3, allocated 64
Found value 3, allocated 1
```

(Note that there's a few more allocations going on with `Hash#find`,
not only the arrays for the pairs, and we get rid of them ALL!)

**How to test the change?**

This operation already has test coverage. I considered adding some
counting of allocated objects, but we've had quite a bunch of
flakiness in some of the profiler specs when counting objects, so
I've decided to not add any performance-specific tests for this.
@ivoanjo ivoanjo requested a review from a team as a code owner July 4, 2024 13:00
@github-actions github-actions bot added the profiling Involves Datadog profiling label Jul 4, 2024
@codecov-commenter
Copy link

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 97.91%. Comparing base (849c60c) to head (f0be554).

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #3757   +/-   ##
=======================================
  Coverage   97.91%   97.91%           
=======================================
  Files        1241     1241           
  Lines       74632    74632           
  Branches     3605     3605           
=======================================
  Hits        73074    73074           
  Misses       1558     1558           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@ivoanjo
Copy link
Member Author

ivoanjo commented Jul 8, 2024

I've reported this upstream to https://bugs.ruby-lang.org/issues/20608 and there's ruby/ruby#11110 to improve this :)

@ivoanjo ivoanjo merged commit cf1f062 into master Jul 8, 2024
168 checks passed
@ivoanjo ivoanjo deleted the ivoanjo/avoid-code-provenance-allocations branch July 8, 2024 08:41
@github-actions github-actions bot added this to the 2.2.0 milestone Jul 8, 2024
@TonyCTHsu TonyCTHsu mentioned this pull request Jul 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
profiling Involves Datadog profiling
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants