-
Notifications
You must be signed in to change notification settings - Fork 436
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
pageserver: handle decompression outside vectored read_blobs
#8942
pageserver: handle decompression outside vectored read_blobs
#8942
Conversation
Signed-off-by: Yuchen Liang <yuchen@neon.tech>
44793b7
to
5f5653a
Compare
4986 tests run: 4822 passed, 0 failed, 164 skipped (full report)Flaky tests (6)Postgres 17
Postgres 16
Postgres 15
Code coverage* (full report)
* collected from Rust tests only The comment gets automatically updated with the latest test results
67bd614 at 2024-09-24T16:56:24.080Z :recycle: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding @koivunej's #8942 (comment): I think if decompress
took a Cow<Bytes>
instead of the Option<Bytes>
it would make the consuming code paths more concise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I chose appending to the buffer in read_blobs
because I didn't want to make compression someone has to explicitly think about when using the reader interface.
Because if it is something someone has to explicitly think about, mistakes can happen, and people can forget about it, resulting in the code working on compressed data. And it is more cumbersome to work with the API.
Therefore, I agree with Christian's suggestion to add a function to VectoredBlob
to allow reading of the blobs, and returning something like Cow<Bytes, &[u8]>
(or a custom enum if that doesn't work).
Ideally one would make the members of VectoredBlob
private so that one can't mistakenly forget about calling the decompress
function on it.
Signed-off-by: Yuchen Liang <yuchen@neon.tech>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Orca Security Scan Summary
Status | Check | Issues by priority | |
---|---|---|---|
Passed | Infrastructure as Code | 0 0 0 0 | View in Orca |
Passed | Secrets | 0 0 0 0 | View in Orca |
Passed | Vulnerabilities | 0 0 0 0 | View in Orca |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Much better now, thanks for adjusting it. It's well done work.
Signed-off-by: Yuchen Liang <yuchen@neon.tech>
Signed-off-by: Yuchen Liang <yuchen@neon.tech>
Signed-off-by: Yuchen Liang <yuchen@neon.tech>
Signed-off-by: Yuchen Liang <yuchen@neon.tech>
Signed-off-by: Yuchen Liang <yuchen@neon.tech>
Part of #8130.
Problem
Currently, decompression is performed within the
read_blobs
implementation and the decompressed blob will be appended to the end of theBytesMut
buffer. We will lose this flexibility of extending the buffer when we switch to using our own dio-aligned buffer (WIP in #8730). To facilitate the adoption of aligned buffer, we need to refactor the code to perform decompression outsideread_blobs
.Summary of changes
VectoredBlobReader::read_blobs
will returnVectoredBlob
without performing decompression and appending decompressed blob. It becomes the caller's responsibility to decompress the buffer.BufView
type that functions asCow<Bytes, &[u8]>
.VectoredBlob::read
so that people don't have to explicitly thinking about compression when using the reader interface.Checklist before requesting a review
Checklist before merging