fix(l0_flush): drops permit before fsync, potential cause for OOMs #8327
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Problem
Slack thread: https://neondb.slack.com/archives/C033RQ5SPDH/p1720511577862519
We're seeing OOMs in staging on a pageserver that has l0_flush.mode=Direct enabled.
There's a strong correlation between jumps in
maxrss_kb
andpageserver_timeline_ephemeral_bytes
, so, it's quite likely that l0_flush.mode=Direct is the culprit.Notably, the expected max memory usage on that staging server by the l0_flush.mode=Direct is ~2GiB but we're seeing as much as 24GiB max RSS before the OOM kill.
One hypothesis is that we're dropping the semaphore permit before all the dirtied pages have been flushed to disk. (The flushing to disk likely happens in the fsync inside the
.finish()
call, because we're using ext4 in data=ordered mode).Summary of changes
Hold the permit until after we're done with
.finish()
.