bypass PageCache for L0 flush

Currently, when we do an `InMemoryLayer::write_to_disk`, there is a tremendous amount of random read I/O, as deltas from the ephemeral file (written in LSN order) are written out to the delta layer in key order.

In benchmarks (https://github.com/neondatabase/neon/pull/7409) we can see that this delta layer writing phase is substantially more expensive than the initial ingest of data, and that within the delta layer write a significant amount of the CPU time is spent traversing the page cache.

It's _really_ slow: on the order of tens of megabytes per second on a fast desktop CPU.

Since this is a background task whose concurrency we can limit, we can simplify and accelerate this by doing the whole thing in memory:
- Read the full ephemeral file into memory -- layers are much smaller than total memory, so this is affordable.
- Do all the random reads directly from this in-memory buffer instead of using blob IO/page cache/disk reads.
- Add a semaphore to limit how many timelines may concurrently do this (to limit peak memory). Set this to ~the number of cores, or some factor of the system memory / layer size, whichever is lower.
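
A rough sketch of the three steps above, to make the proposal concrete. This is not the pageserver code: `BlobRef`, `load_ephemeral`, `blobs_in_key_order`, `flush_permits` and the `u64` key type are all hypothetical names chosen for illustration, and clamping the permit count to at least 1 is an assumption so that flushes can always make progress.

```rust
use std::collections::BTreeMap;
use std::io;
use std::path::Path;

/// Hypothetical index entry: where one delta's bytes live in the ephemeral file.
#[derive(Clone, Copy, Debug, PartialEq)]
struct BlobRef {
    offset: usize,
    len: usize,
}

/// Step 1: read the whole ephemeral file into memory in one sequential read.
#[allow(dead_code)]
fn load_ephemeral(path: &Path) -> io::Result<Vec<u8>> {
    std::fs::read(path)
}

/// Step 2: walk the index in key order and serve every "random read" as a
/// slice of the in-memory buffer, with no page cache or disk access involved.
/// The BTreeMap iterates keys in ascending order; entries under one key keep
/// their insertion (i.e. LSN) order.
fn blobs_in_key_order<'a>(
    ephemeral: &'a [u8],
    index: &BTreeMap<u64, Vec<BlobRef>>,
) -> Vec<(u64, &'a [u8])> {
    let mut out = Vec::new();
    for (key, refs) in index {
        for r in refs {
            out.push((*key, &ephemeral[r.offset..r.offset + r.len]));
        }
    }
    out
}

/// Step 3: how many flushes may run at once. Takes the lower of the core
/// count and the number of layer-sized buffers that fit in the memory budget.
/// Assumption: never return 0, so at least one flush can always proceed.
fn flush_permits(cores: usize, memory_budget: u64, layer_size: u64) -> usize {
    let by_memory = (memory_budget / layer_size.max(1)) as usize;
    cores.min(by_memory).max(1)
}
```

The value returned by `flush_permits` would be the initial permit count of whatever semaphore guards the flush path; each flush would hold one permit for as long as it keeps the buffer alive.
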
```[tasklist]
### Impl
- [ ] https://github.com/neondatabase/neon/pull/8186
- [ ] https://github.com/neondatabase/neon/pull/8190
- [ ] https://github.com/neondatabase/aws/pull/1568
- [ ] https://github.com/neondatabase/neon/pull/8327
- [ ] https://github.com/neondatabase/aws/pull/1596
- [ ] https://github.com/neondatabase/aws/pull/1601
- [ ] https://github.com/neondatabase/aws/pull/1605
- [ ] https://github.com/neondatabase/aws/pull/1622
- [ ] https://github.com/neondatabase/aws/pull/1655
- [ ] https://github.com/neondatabase/neon/pull/8534
- [ ] https://github.com/neondatabase/azure/pull/270
- [x] gradual prod rollout
- [ ] https://github.com/neondatabase/aws/pull/1656
- [ ] https://github.com/neondatabase/aws/pull/1671
- [ ] https://github.com/neondatabase/aws/pull/1723
- [ ] https://github.com/neondatabase/aws/pull/1737
- [x] decommission mode `page-cached`
- [ ] https://github.com/neondatabase/neon/pull/8739
```

Follow-ups:
* https://github.com/neondatabase/neon/issues/8894

bypass PageCache for L0 flush #7418

jcsp opened on Apr 18, 2024
