Closed
Description
opened on Jun 27, 2024
part of epic #7386
bit of prior discussion in https://neondb.slack.com/archives/C033RQ5SPDH/p1719411245662839
`InMemoryLayer::get_values_reconstruct_data` uses `read_blob`, which internally uses the PageCache for block access.
Switch it to vectored reads that bypass the PageCache.
However, the new code should perform as well as the current code in the case where a single call reads multiple blobs from the same 8 KiB `EphemeralFile` page.
Strategy for this (planned together with @VladLazar ):
- store the blob lengths in the in-memory btree
- avoid consuming more memory by using `u32` instead of `u64` for the offset. `u32` is enough if we cap `EphemeralFile` at 4 GiB, which is way larger than we want it to grow anyway.
- Get rid of the whole `blob_io` business for `InMemoryLayer`; we don't need it if we store offset and length in the in-memory index.
- For `get_values_reconstruct_data`, feed the `(offset, length)` pairs directly into the `VectoredReadBuilder` (after sorting them in offset order, so the builder can merge adjacent blob reads as needed)
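
A rough sketch of the idea, not the actual pageserver code: `BlobRef`, `PlannedRead`, `plan_reads`, and `max_gap` are hypothetical names made up for illustration, and the real `VectoredReadBuilder` API may look quite different. It only shows that an `(offset, length)` pair stored as two `u32`s fits in the space of one `u64`, and how sorting by offset lets adjacent blobs be coalesced into a single read.

```rust
/// Hypothetical in-memory index entry: where one blob lives in the ephemeral file.
/// Two u32 fields occupy 8 bytes, the same as a single u64 offset.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct BlobRef {
    pub offset: u32,
    pub len: u32,
}

impl BlobRef {
    /// Returns None if the offset or length does not fit in u32,
    /// i.e. if the file would have to grow past the assumed 4 GiB cap.
    pub fn new(offset: u64, len: u64) -> Option<Self> {
        Some(BlobRef {
            offset: u32::try_from(offset).ok()?,
            len: u32::try_from(len).ok()?,
        })
    }
}

/// Hypothetical result type: one contiguous read serving one or more blobs.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct PlannedRead {
    pub start: u64,
    /// Exclusive end offset of the read.
    pub end: u64,
    pub blobs: Vec<BlobRef>,
}

/// Sort blobs by offset, then merge each blob into the previous read when it
/// starts within `max_gap` bytes of that read's end. `max_gap = 0` merges only
/// blobs that touch or overlap the previous read.
pub fn plan_reads(mut blobs: Vec<BlobRef>, max_gap: u64) -> Vec<PlannedRead> {
    blobs.sort_by_key(|b| b.offset);
    let mut reads: Vec<PlannedRead> = Vec::new();
    for blob in blobs {
        let start = u64::from(blob.offset);
        let end = start + u64::from(blob.len);
        match reads.last_mut() {
            Some(last) if start <= last.end + max_gap => {
                // Extend the current read to cover this blob as well.
                last.end = last.end.max(end);
                last.blobs.push(blob);
            }
            _ => reads.push(PlannedRead { start, end, blobs: vec![blob] }),
        }
    }
    reads
}
```

With blobs at offsets 0 (50 bytes), 50 (30 bytes), and 8192 (100 bytes), `plan_reads(.., 0)` yields two reads: one covering `0..80` for the first two blobs, and one covering `8192..8292`. Several blobs on the same page therefore cost one read rather than one per blob.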