Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gzip: Allocate gzip buffers from storage #4307

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

dridi
Copy link
Member

@dridi dridi commented Mar 31, 2025

Arbitrary allocation from storage was introduced to solve the h2
head-of-line blocking caused by early DATA frames, allowed by the
initial control flow window of new streams. The decision could have
been made then to allocate the window buffer from the heap, but it
was tied to storage instead, offering better control over the global
memory footprint of the cache process.

The gzip buffer is tied to a task, but too large to allocate from
workspace without posing a risk. Even worse, there may be multiple
concurrent gzip operations in a single task. For example a gunzip
needed to parse ESI followed by a gzip to store a compressed body.

For a similar reason, it was not opportune to allocate the h2 stream
window buffer from workspace, despite being tied to the task.

Following the same logic, the gzip allocation can be performed from
storage to remove one wild card in our heap consumption. Another
possibility could have been the addition of a mempool, and it could
also have been an alternative for the h2 stream window, but transient
storage seemed more appropriate and it matches the on-demand malloc
behavior.

The gzip buffer logic could have been better encapsulated, but the
amount of direct access to the m_buf field would have resulted in a
lot of noise. Most of the noise here is caused by the two function
signatures changed to take a worker argument.

dridi added 3 commits March 31, 2025 13:59
Unlike the input buffer, the output buffer must be writable to be
usable. The pointer used to be const, but the restriction was lifted
without touching the macro.

Refs e633a7f
Arbitrary allocation from storage was introduced to solve the h2
head-of-line blocking caused by early DATA frames, allowed by the
initial control flow window of new streams. The decision could have
been made then to allocate the window buffer from the heap, but it
was tied to storage instead, offering better control over the global
memory footprint of the cache process.

The gzip buffer is tied to a task, but too large to allocate from
workspace without posing a risk. Even worse, there may be multiple
concurrent gzip operations in a single task. For example a gunzip
needed to parse ESI followed by a gzip to store a compressed body.

For a similar reason, it was not opportune to allocate the h2 stream
window buffer from workspace, despite being tied to the task.

Following the same logic, the gzip allocation can be performed from
storage to remove one wild card in our heap consumption. Another
possibility could have been the addition of a mempool, and it could
also have been an alternative for the h2 stream window, but transient
storage seemed more appropriate and it matches the on-demand malloc
behavior.

The gzip buffer logic could have been better encapsulated, but the
amount of direct access to the m_buf field would have resulted in a
lot of noise. Most of the noise here is caused by the two function
signatures changed to take a worker argument.
@bsdphk
Copy link
Contributor

bsdphk commented Mar 31, 2025

Good idea: +1

Copy link
Member

@nigoroll nigoroll left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍, just a minor point: Do we really benefit from keeping m_buf?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants