Skip to content

Fewer allocations in gem installation#6975

Merged
segiddins merged 1 commit intomasterfrom
segiddins/fewer-allocations-in-gem-installation
Dec 11, 2023
Merged

Fewer allocations in gem installation#6975
segiddins merged 1 commit intomasterfrom
segiddins/fewer-allocations-in-gem-installation

Conversation

@segiddins
Copy link
Contributor

For now, on a small rails app I have hanging around:

==> memprof.after.txt <==
Total allocated: 872.51 MB (465330 objects)
Total retained:  40.48 kB (326 objects)

==> memprof.before.txt <==
Total allocated: 890.79 MB (1494026 objects)
Total retained:  40.40 kB (328 objects)

Not a huge difference in memory usage, but it's a drastic improvement
in total number of allocations.

Additionally, this will pay huge dividends once
ruby/zlib#61 is merged, as it will allow us to
completely avoid allocations in the repeated calls to readpartial,
which currently accounts for most of the memory usage shown above.

What was the end-user or developer problem that led to this PR?

What is your fix for the problem, implemented in this PR?

Make sure the following tasks are checked


if entry.file?
File.open(destination, "wb") {|out| out.write entry.read }
File.open(destination, "wb") {|out| IO.copy_stream entry, out }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, truffleruby failing in your tests too. I haven't been able to extract a distilled case yet, but @eregon could maybe take a look anyway? I think the issue is that copy_stream is not reading and writing byte for byte identically.

There's layers and layers here, the src entry is a TarReader wrapping a zlib reader, wrapping a TarReader. This change alone causes the bug, if I'm not mistaken.

You may have to revert this part until it's fixed in truffleruby, or found not to be broken?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems likely to be truffleruby/truffleruby#3280 which is now fixed in dev builds

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@martinemde thoughts on introducing a utility method that is IO.copy_stream on all rubies except truffleruby before 23.1.2? Could go into #6958

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a great idea. Even fixed it's still broken on older trufflerubies, even if people don't usually update rubygems separately.

Copy link
Contributor

@martinemde martinemde Dec 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can add it to #6958. (Or please do if you'd like to get it done and I haven't gotten to it)

gzio.read 16_384 until gzio.eof? # gzip checksum verification
buf = String.new(capacity: 16_384, encoding: Encoding::BINARY)
begin
gzio.readpartial(16_384, buf) until gzio.eof? # gzip checksum verification
Copy link
Contributor

@martinemde martinemde Sep 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

gzio read partial returns arbitrarily different length strings on some gzips. Check the pathological gem that I mentioned in the zlib bug. It extracts validly when using read but not readpartial. I don't know how common this problem is.

EDIT: The real bug here is that gzio.readpartial until eof fails right now with Zlib.

Comment on lines 248 to 255
eval <<-RUBY, binding, __FILE__, __LINE__ + 1
# frozen_string_literal: true

def copy_vals(vals)
#{FIELDS.map {|name| "@#{name} = vals[:#{name}]" }.join("; ")}
end
RUBY
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This definitely needs a comment about why. This is not your grandma's self documenting code. 👵🏻

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or, alternatively, why convert to instance variables at all? Why not store vals directly and iterate FIELDS one time to create attr readers that just lookup the key in the vals hash. vals is mutable and is already allocated.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After coming back to this 3 months later, I think I'd prefer we just write out all the assignments just like in ==. The eval feels really complex compared to simple repetition.

@martinemde martinemde force-pushed the segiddins/fewer-allocations-in-gem-installation branch from f50168d to 1488197 Compare October 12, 2023 01:32
@martinemde
Copy link
Contributor

Blocked until code is updated to not use readpartial or Zlib is updated.

@segiddins segiddins force-pushed the segiddins/fewer-allocations-in-gem-installation branch 3 times, most recently from 04b7958 to 8e7c2ca Compare December 8, 2023 02:57
@segiddins
Copy link
Contributor Author

My git scripts messed up, will clean up tomorrow

@segiddins segiddins force-pushed the segiddins/fewer-allocations-in-gem-installation branch 4 times, most recently from dca336c to 4bba03a Compare December 8, 2023 19:53
@segiddins
Copy link
Contributor Author

Sorry for the churn, @martinemde this should be ready for another look

@segiddins segiddins force-pushed the segiddins/fewer-allocations-in-gem-installation branch from 4bba03a to 5a12814 Compare December 9, 2023 00:30
@segiddins segiddins force-pushed the segiddins/fewer-allocations-in-gem-installation branch from 5a12814 to 17dbff2 Compare December 11, 2023 21:08
Copy link
Contributor

@martinemde martinemde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome!

For now, on a small rails app I have hanging around:

```
==> memprof.after.txt <==
Total allocated: 872.51 MB (465330 objects)
Total retained:  40.48 kB (326 objects)

==> memprof.before.txt <==
Total allocated: 890.79 MB (1494026 objects)
Total retained:  40.40 kB (328 objects)
```

Not a huge difference in memory usage, but it's a drastic improvement
in total number of allocations.

Additionally, this will pay huge dividends once
ruby/zlib#61 is merged, as it will allow us to
completely avoid allocations in the repeated calls to readpartial,
which currently accounts for most of the memory usage shown above.
@segiddins segiddins force-pushed the segiddins/fewer-allocations-in-gem-installation branch from 17dbff2 to f78d45d Compare December 11, 2023 22:11
@segiddins segiddins enabled auto-merge December 11, 2023 23:10
@segiddins segiddins merged commit 8f1557d into master Dec 11, 2023
@segiddins segiddins deleted the segiddins/fewer-allocations-in-gem-installation branch December 11, 2023 23:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants