bypass PageCache for InMemoryLayer::get_values_reconstruct_data #8183

Closed

Description

part of epic #7386

bit of prior discussion in https://neondb.slack.com/archives/C033RQ5SPDH/p1719411245662839


InMemoryLayer::get_values_reconstruct_data uses read_blob, which internally uses the PageCache for block access.

Switch it to vectored reads that bypass the PageCache.

However, the new code should match the current code's performance in the case where a single call reads multiple blobs from the same 8 KiB EphemeralFile page.

Strategy for this (planned together with @VladLazar):

  1. Store the blob lengths in the in-memory btree.
     • Avoid consuming more memory by using u32 instead of u64 for the offset. u32 is enough if we cap EphemeralFile at 4 GiB, which is way larger than we want it to grow anyway.
  2. Get rid of the whole blob_io business for InMemoryLayer; we don't need it if we store offset and length in the in-memory index.
  3. For get_values_reconstruct_data, feed the (offset, length) pairs directly into the VectoredReadBuilder (after sorting them by offset, so the builder can merge adjacent blob reads as needed).
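
To make the strategy concrete, here is a minimal sketch of the sort-then-merge step. It is not the pageserver code and does not use the real VectoredReadBuilder API: `BlobLocation`, `CoalescedRead`, and `coalesce_reads` are hypothetical names invented for illustration, and the merge rule assumed here only joins blobs that are exactly adjacent in the file.

```rust
/// Hypothetical index value: where a blob lives in the ephemeral file.
/// Two u32 fields occupy 8 bytes, the same as a single u64 offset,
/// so storing the length costs no extra space per entry.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct BlobLocation {
    pub offset: u32, // sufficient if the file is capped at 4 GiB
    pub len: u32,
}

/// Hypothetical merged read covering one or more adjacent blobs.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct CoalescedRead {
    pub start: u64,
    pub end: u64, // exclusive
    pub blobs: Vec<BlobLocation>,
}

/// Sort the requested blobs by offset, then merge each blob into the
/// previous read when it starts exactly where that read ends.
pub fn coalesce_reads(mut locations: Vec<BlobLocation>) -> Vec<CoalescedRead> {
    locations.sort_by_key(|loc| loc.offset);

    let mut reads: Vec<CoalescedRead> = Vec::new();
    for loc in locations {
        // Widen to u64 so offset + len cannot overflow near the 4 GiB cap.
        let start = u64::from(loc.offset);
        let end = start + u64::from(loc.len);

        match reads.last_mut() {
            // Adjacent to the previous read: extend it instead of starting a new one.
            Some(last) if last.end == start => {
                last.end = end;
                last.blobs.push(loc);
            }
            // First blob, or a gap before this blob: start a new read.
            _ => reads.push(CoalescedRead {
                start,
                end,
                blobs: vec![loc],
            }),
        }
    }
    reads
}
```

With this rule, several blobs stored back to back on the same 8 KiB page collapse into one read, which is the case where the new code has to keep up with the PageCache-based path. A real builder would likely also cap the size of a merged read; that is left out of this sketch.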
