pageserver: do vectored read on each dio-aligned section once #8763

yliang412 · 2024-08-20T04:18:17Z

Part of #8130, closes #8719.

Problem

Currently, vectored blob IO only coalesces blocks if they are immediately adjacent to each other. When we switch to direct IO, we need a way to coalesce blobs that fall within the same dio-aligned boundary but have gaps between them.
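To make the gap concrete, here is a minimal sketch (not the pageserver code) contrasting the two conditions. The helpers `is_adjacent` and `same_chunk` are hypothetical names introduced for illustration, and the alignment is assumed to be a non-zero power of two such as 512.

```rust
/// Hypothetical helper: adjacency check, as in the existing behaviour described above.
/// Two blobs coalesce only if the next one starts exactly where the previous one ends.
fn is_adjacent(prev_end: u64, next_start: u64) -> bool {
    prev_end == next_start
}

/// Hypothetical helper: chunk-based check.
/// Two byte offsets belong to the same chunk if they share the same
/// `align`-sized window. `align` is assumed to be a non-zero power of two.
fn same_chunk(offset_a: u64, offset_b: u64, align: u64) -> bool {
    offset_a / align == offset_b / align
}
```

For example, with blob A at [0, 100) and blob B at [200, 300), `is_adjacent(100, 200)` is false, while `same_chunk(99, 200, 512)` is true: both blobs sit in the first 512-byte chunk, so one aligned read could serve both.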

Summary of changes

- Introduces a VectoredReadCoalesceMode for VectoredReadPlanner and StreamingVectoredReadPlanner, which has two modes:
  - AdjacentOnly (current implementation)
  - Chunked(<alignment requirement>)
- New ChunkedVectorBuilder that batches reads in dio-align-sized chunks. The start and end of the vectored read respect stx_dio_offset_align / stx_dio_mem_align, so vectored_read.start and vectored_read.blobs_at.first().start_offset can be two different values.
- Since this breaks the assumption that blobs within a single VectoredRead are next to each other (implicit end offset), we now store blob end offsets in the VectoredRead.
- Adapted existing tests to run in both VectoredReadCoalesceMode variants.
- The IO alignment can also be configured live at runtime.
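The changes above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in vectored_blob_io.rs: only the enum name and its two variants come from the description, while `BlobSpan`, `read_range`, and `blob_slice` are hypothetical names, and the real builder also enforces a maximum read size that is omitted here.

```rust
/// The two coalescing modes named in the summary.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum VectoredReadCoalesceMode {
    AdjacentOnly,
    /// Alignment requirement in bytes; assumed to be a non-zero power of two.
    Chunked(usize),
}

/// Hypothetical blob entry. The end offset is stored explicitly because,
/// in chunked mode, the next blob no longer has to start where this one ends.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct BlobSpan {
    start_offset: u64,
    end_offset: u64,
}

/// Hypothetical helper: the byte range [start, end) one read would cover
/// for `blobs`, which are assumed to be sorted by offset.
fn read_range(mode: VectoredReadCoalesceMode, blobs: &[BlobSpan]) -> Option<(u64, u64)> {
    let first = blobs.first()?;
    let last = blobs.last()?;
    match mode {
        VectoredReadCoalesceMode::AdjacentOnly => Some((first.start_offset, last.end_offset)),
        VectoredReadCoalesceMode::Chunked(align) => {
            let align = align as u64;
            // Round the start down and the end up to the alignment boundary.
            let start = first.start_offset / align * align;
            let end = (last.end_offset + align - 1) / align * align;
            Some((start, end))
        }
    }
}

/// Hypothetical helper: locate one blob inside the buffer returned by the read.
/// `read_start` is the (possibly rounded-down) file offset of `buf[0]`.
fn blob_slice<'a>(buf: &'a [u8], read_start: u64, blob: &BlobSpan) -> &'a [u8] {
    let from = (blob.start_offset - read_start) as usize;
    let to = (blob.end_offset - read_start) as usize;
    &buf[from..to]
}
```

With blobs at [100, 200) and [700, 900) and `Chunked(512)`, the read covers [0, 1024): its start (0) differs from the first blob's start offset (100), and the second blob can only be sliced out of the buffer because its end offset is stored.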

Testing

See #8779 for a matrix build of the regression test with alignment requirement = [1, 512].

Performance

Benchmark Results

TLDR: No significant difference between using different chunk sizes.

Rollout

The adjacent-only merge is enabled by default (io_buffer_alignment=0).
Rust unit tests run with alignment requirement = [0, 1, 512].
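As a rough sketch of how the setting could map to a coalesce mode: the description says io_buffer_alignment=0 selects the adjacent-only path, and a commit in this PR mentions an is_power_of_two_or_zero check. The function `mode_from_alignment` and its error type are assumptions made for illustration, not the pageserver's actual config code.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum VectoredReadCoalesceMode {
    AdjacentOnly,
    Chunked(usize),
}

/// Valid alignments are 0 (adjacent-only) or a power of two.
fn is_power_of_two_or_zero(n: usize) -> bool {
    n == 0 || n.is_power_of_two()
}

/// Hypothetical mapping from the `io_buffer_alignment` setting to a mode.
fn mode_from_alignment(io_buffer_alignment: usize) -> Result<VectoredReadCoalesceMode, String> {
    if !is_power_of_two_or_zero(io_buffer_alignment) {
        return Err(format!(
            "io_buffer_alignment must be 0 or a power of two, got {io_buffer_alignment}"
        ));
    }
    Ok(if io_buffer_alignment == 0 {
        VectoredReadCoalesceMode::AdjacentOnly
    } else {
        VectoredReadCoalesceMode::Chunked(io_buffer_alignment)
    })
}
```

Under this sketch, the three values exercised by the unit tests (0, 1, 512) map to AdjacentOnly, Chunked(1), and Chunked(512), while a value such as 3 is rejected.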

We will test the new chunked vectored read code path in pre-prod later this week after release.

Checklist before requesting a review

I have performed a self-review of my code.
If it is a core feature, I have added thorough tests.
Do we need to implement analytics? If so, did you add the relevant metrics to the dashboard?
If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section.

Checklist before merging

Do not forget to reformat commit message to not include the above checklist

github-actions · 2024-08-20T04:37:34Z

3780 tests run: 3674 passed, 0 failed, 106 skipped (full report)

Code coverage* (full report)

functions: 32.3% (7328 of 22654 functions)
lines: 50.4% (59285 of 117523 lines)

* collected from Rust tests only

The comment gets automatically updated with the latest test results.
2443e9a at 2024-08-28T14:18:55.634Z


VladLazar

Generally looks good

Are we supporting these two modes to allow for a staged cut-over to dio?

pageserver/src/tenant/vectored_blob_io.rs

pageserver/src/virtual_file.rs

yliang412 · 2024-08-21T13:42:40Z

> Are we supporting these two modes to allow for a staged cut-over to dio?

Currently doing perf testing to see if this is mergeable without the actual O_DIRECT changes. I will report back findings once I'm done.


problame

This review pertains to code that is taken by the Adjacent path, i.e., this review is to ensure we're not regressing anything.

pageserver/src/tenant/vectored_blob_io.rs

problame

Reviewed the ChunkedVectoredReadBuilderInner.

Seems correct, some style comments.

pageserver/src/tenant/vectored_blob_io.rs


yliang412 changed the title ~~Yuchen/vectored read chunk coalesce~~ [WIP] vectored read chunk coalesce Aug 20, 2024

github-actions bot added the external label (A PR or Issue is created by an external user) Aug 20, 2024

yliang412 removed the external label (A PR or Issue is created by an external user) Aug 20, 2024

yliang412 changed the title ~~[WIP] vectored read chunk coalesce~~ pageserver: do vectored read on each dio-aligned section once Aug 20, 2024

yliang412 self-assigned this Aug 20, 2024

yliang412 added the c/storage/pageserver label (Component: storage: pageserver) Aug 20, 2024

yliang412 marked this pull request as ready for review August 20, 2024 14:30

yliang412 requested a review from a team as a code owner August 20, 2024 14:30

yliang412 requested review from skyzh, problame and VladLazar and removed request for a team August 20, 2024 14:30

yliang412 added 2 commits August 20, 2024 14:56

pageserver: do vectored read on each dio-aligned section once

ab97ad5

Signed-off-by: Yuchen Liang <yuchen@neon.tech>

fix clippy

ea5efeb

Signed-off-by: Yuchen Liang <yuchen@neon.tech>

yliang412 force-pushed the yuchen/vectored-read-chunk-coalesce branch from 0af8e83 to ea5efeb Compare August 20, 2024 19:01

yliang412 requested review from a team as code owners August 20, 2024 19:01

yliang412 requested review from cloneable and hlinnaka and removed request for a team August 20, 2024 19:01

yliang412 changed the base branch from problame/inmemory-layer-offset-u32 to main August 20, 2024 19:01

yliang412 removed request for cloneable and hlinnaka August 20, 2024 19:02

yliang412 marked this pull request as draft August 20, 2024 19:02

log io buffer alignment setting at pageserver startup

674460d

Signed-off-by: Yuchen Liang <yuchen@neon.tech>

VladLazar reviewed Aug 21, 2024

pageserver/src/virtual_file.rs

yliang412 and others added 3 commits August 22, 2024 23:09

trying out 512 bytes io alignment as default

817981a

Signed-off-by: Yuchen Liang <yuchen@neon.tech>

Merge branch 'main' into yuchen/vectored-read-chunk-coalesce

d4f2c67

stonger checks; fix copy_delta_prefix_smoke by getting coalesce mode

84f770b

Signed-off-by: Yuchen Liang <yuchen@neon.tech>

problame reviewed Aug 23, 2024

pageserver/src/tenant/vectored_blob_io.rs (outdated)

pageserver/src/tenant/vectored_blob_io.rs (outdated)

yliang412 added 6 commits August 25, 2024 16:07

review: put common initialization logic into new_impl

124c97f

Signed-off-by: Yuchen Liang <yuchen@neon.tech>

review: rename vector read builders; fix typo

d0fd9b5

Signed-off-by: Yuchen Liang <yuchen@neon.tech>

review(style): make used-only-once is_adjacent_chunk_read a variable

9213dc3

Signed-off-by: Yuchen Liang <yuchen@neon.tech>

review(style): make not_limited_by_max_read_size a variable

e5bbece

Signed-off-by: Yuchen Liang <yuchen@neon.tech>

review(read_blobs): use parsed blob_size to determine end offset

70dc350

Signed-off-by: Yuchen Liang <yuchen@neon.tech>

update comments

fe2393c

Signed-off-by: Yuchen Liang <yuchen@neon.tech>

yliang412 force-pushed the yuchen/vectored-read-chunk-coalesce branch from dff338c to fe2393c Compare August 25, 2024 20:31

Merge branch 'main' into yuchen/vectored-read-chunk-coalesce

9b28cca

problame approved these changes Aug 26, 2024


yliang412 mentioned this pull request Aug 26, 2024

pageserver: direct I/O #8130

Open

yliang412 and others added 5 commits August 26, 2024 16:37

use io_buffer_alignment=0 for adjacent-only read builder; set default

bf71a7a

Signed-off-by: Yuchen Liang <yuchen@neon.tech>

add ability to specify alignment in test

b1fe356

Signed-off-by: Yuchen Liang <yuchen@neon.tech>

add ability to live config alignment

8e59532

Signed-off-by: Yuchen Liang <yuchen@neon.tech>

Merge branch 'main' into yuchen/vectored-read-chunk-coalesce

7bfe353

fix vendor

0c6e122

Signed-off-by: Yuchen Liang <yuchen@neon.tech>

yliang412 marked this pull request as ready for review August 26, 2024 21:33

yliang412 and others added 5 commits August 26, 2024 20:51

update alignment check condition to be is_power_of_two_or_zero

686c4f6

Signed-off-by: Yuchen Liang <yuchen@neon.tech>

fix error around using alignment=0

9663b8f

Signed-off-by: Yuchen Liang <yuchen@neon.tech>

Merge branch 'main' into yuchen/vectored-read-chunk-coalesce

d45492e

make io alignment an optional config param in fixtures

01a2288

Signed-off-by: Yuchen Liang <yuchen@neon.tech>

Merge branch 'main' into yuchen/vectored-read-chunk-coalesce

2443e9a

yliang412 enabled auto-merge (squash) August 28, 2024 13:11

yliang412 merged commit a889a49 into main Aug 28, 2024
70 checks passed

yliang412 deleted the yuchen/vectored-read-chunk-coalesce branch August 28, 2024 14:54
