Skip to content

Conversation

@brian-pane
Copy link

This matches the semantics of zlib-ng: when we allocate a double-sized input buffer in write mode, normal gz write functions are only allowed to write to the front half of the buffer. The remaining space is reserved for gzprintf.

@codecov
Copy link

codecov bot commented May 31, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Flag Coverage Δ
fuzz-compress ?
fuzz-decompress ?
test-aarch64-apple-darwin 93.36% <100.00%> (-0.02%) ⬇️
test-x86_64-apple-darwin 91.69% <100.00%> (-0.01%) ⬇️
test-x86_64-unknown-linux-gnu 90.47% <100.00%> (+0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
libz-rs-sys-cdylib/src/gz.rs 91.21% <100.00%> (+0.11%) ⬆️

... and 6 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@brian-pane brian-pane mentioned this pull request May 31, 2025
@folkertdev
Copy link
Member

you can cherry-pick c74a62e to fix those CI issues.

There are a bunch of size * 2 and size << 1 in the zlib-ng codebase that (as far as I can tell) we're missing now. e.g. here https://github.com/zlib-ng/zlib-ng/blob/develop/gzread.c.in#L229. I think we need to add those now?

@brian-pane
Copy link
Author

That particular line of zlib-ng is generating an offset relative to the out buffer. It's in a function that's only used in read mode, which means that the out buffer's size is actually 2 * size. The zlib-rs equivalent uses out_size, which is set once to the actual size of the output buffer for the mode, instead of the magic size * 2. I found it easier to reason about the bounds checks by using out_size (which always matches the allocated output buf size) instead of 2 * size in read operations and size in write operations. That part is unchanged with this PR.

The same used to apply to in_size: it was always the size of the allocated input buffer. With this patch, that's no longer the case. Now the semantics are:

  • In read mode, in_size is the full size of the allocated input buffer, which matches what size represents in zlib-ng.
  • In write mode, in_size is half the size of the allocated input buffer, which also matches what size represents in zlib-ng.

So for read operations, which don't get a double-size input buffer, we don't need to double in_size. For write operations, we usually don't need to double it, since we're only half-filling the buffer. There are three exception cases where we need to refer to in_size * 2:

  • When allocating the input buffer in write mode, we need to double the requested size. This PR does that by using the new in_capacity() to compute the allocation size.
  • When deallocating the input buffer, we use in_capacity() again to match the allocation.
  • gzprintf, as the one thing allowed to write beyond the halfway point, will need to use in_capacity() instead of in_size to compute the total available buffer space.

(cherry picked from commit c74a62e)
Copy link
Member

@folkertdev folkertdev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent, thanks!

@folkertdev folkertdev merged commit 5fa37d3 into trifectatechfoundation:main Jun 1, 2025
24 checks passed
@brian-pane brian-pane deleted the gz-alloc branch July 26, 2025 18:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants