
bypass PageCache for L0 flush #7418

Closed

Description

Currently, when we do an InMemoryLayer::write_to_disk, there is a tremendous amount of random read I/O, because deltas in the ephemeral file (written in LSN order) are written out to the delta layer in key order.

In benchmarks (#7409) we can see that this delta layer writing phase is substantially more expensive than the initial ingest of data, and that within the delta layer write a significant amount of the CPU time is spent traversing the page cache.

It's really slow: on the order of tens of megabytes per second on a fast desktop CPU.

Since this is a background task whose concurrency we can limit, we can simplify and accelerate this by doing the whole thing in memory:

  • Read the full ephemeral file into memory -- layers are much smaller than total memory, so this is affordable.
  • Do all the random reads directly from this in-memory buffer instead of using blob IO/page cache/disk reads.
  • Add a semaphore to limit how many timelines may concurrently do this (to limit peak memory). Set this to ~the number of cores, or some factor of the system memory / layer size, whichever is lower.
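
The three steps above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the pageserver's actual code: `BlobRef`, `FlushLimiter`, `flush_permits` and `flush_from_memory` are hypothetical names, keys and LSNs are simplified to `u64`, and the `(key, lsn)` index is assumed to already exist in memory. The buffer is assumed to have been loaded in one sequential read (for example with `std::fs::read`).

```rust
use std::collections::BTreeMap;
use std::sync::{Condvar, Mutex};

/// Hypothetical index entry: where one value lives inside the ephemeral file.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct BlobRef {
    pub offset: usize,
    pub len: usize,
}

/// Permit count: the lower of the core count and how many layers fit in a
/// caller-chosen memory budget (some fraction of system memory). At least 1.
pub fn flush_permits(cores: usize, memory_budget: u64, layer_size: u64) -> usize {
    let by_memory = (memory_budget / layer_size.max(1)) as usize;
    cores.min(by_memory).max(1)
}

/// Minimal counting semaphore (hypothetical helper) bounding concurrent flushes.
pub struct FlushLimiter {
    permits: Mutex<usize>,
    available: Condvar,
}

/// Returns the permit when dropped, including on panic inside the flush.
struct Permit<'a>(&'a FlushLimiter);

impl Drop for Permit<'_> {
    fn drop(&mut self) {
        *self.0.permits.lock().unwrap() += 1;
        self.0.available.notify_one();
    }
}

impl FlushLimiter {
    pub fn new(permits: usize) -> Self {
        Self { permits: Mutex::new(permits.max(1)), available: Condvar::new() }
    }

    /// Block until a permit is free, then run `f` while holding it.
    pub fn run<T>(&self, f: impl FnOnce() -> T) -> T {
        let mut free = self.permits.lock().unwrap();
        while *free == 0 {
            free = self.available.wait(free).unwrap();
        }
        *free -= 1;
        drop(free); // do not hold the mutex during the flush itself
        let _permit = Permit(self);
        f()
    }
}

/// Emit values in (key, lsn) order by slicing the in-memory copy of the
/// ephemeral file; no page cache or disk reads are involved.
pub fn flush_from_memory(
    buf: &[u8],
    index: &BTreeMap<(u64, u64), BlobRef>,
    mut sink: impl FnMut(u64, u64, &[u8]),
) {
    // BTreeMap iterates in ascending (key, lsn) order.
    for (&(key, lsn), blob) in index {
        sink(key, lsn, &buf[blob.offset..blob.offset + blob.len]);
    }
}
```

A caller would wrap each timeline's flush in `limiter.run(|| flush_from_memory(...))`, so peak memory stays bounded by roughly the permit count times the layer size.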

Impl

Follow-ups:
