feat(fuse): async read-ahead buffer for smoother FUSE throughput#493
Merged
Conversation
FUSE reads were blocking synchronously on NNTP segment downloads, causing 66.7% of transfer time to be spent stalling (193 gaps >100ms in a 2.3GB transfer). This was because each 16KB FUSE read blocked directly on UsenetReader segment availability with no buffering layer.

Add AsyncReadBuffer that wraps ReadAtContext with a background goroutine continuously filling a ring buffer. FUSE reads now pull from pre-filled memory instead of blocking on network I/O, matching rclone's AsyncReader pattern.

Key design decisions:
- Buffer lives at the FUSE level (both cgofuse and hanwen), not affecting WebDAV
- Lazy initialization: buffer and goroutine are only allocated on first read
- Size threshold: only files larger than the buffer get buffered (skips Finder metadata reads)
- Configurable via async_buffer_size in FuseConfig (default 8MB, 0 to disable)
- 1MB fill chunks to reduce mutex overhead (one lock per ~segment)

Results: FUSE read speed improved from ~24 MB/s to ~41 MB/s (within 4% of rclone mount speed), with dramatically fewer stalls.
javi11 added a commit that referenced this pull request on Apr 14, 2026
…493) Add AsyncReadBuffer at the FUSE layer that wraps ReadAtContext with a background goroutine filling a ring buffer. FUSE reads pull from pre-filled memory instead of blocking on NNTP segment downloads.

Before: ~24 MB/s with 66% stall time (reads blocked on segment downloads)
After: ~41 MB/s, matching rclone mount speed within 4%

Key design:
- Ring buffer (8MB default) with background fill goroutine
- Seek-aware: non-sequential reads reset the buffer via a generation counter
- Lazy initialization: no memory allocated until first read
- Size threshold: only files larger than the buffer get buffered (skips Finder)
- Lives at the FUSE level only; the WebDAV path is unaffected
- Configurable via async_buffer_size in FuseConfig (0 to disable)
- Frontend UI added for the new setting

Also:
- Reduce default max_prefetch from 60 to 30 segments (the async buffer covers the gap and halves segment memory per reader)
- Remove WarmUp(), which raced with the async buffer
Summary
- AsyncReadBuffer at the FUSE layer that wraps ReadAtContext with a background goroutine filling a ring buffer
- async_buffer_size in FUSE config (default 8MB, 0 to disable)

Problem
FUSE reads blocked synchronously on NNTP segment downloads. Profiling showed 66.7% of transfer time was stalling with 193 gaps >100ms during a 2.3GB file transfer. During active reads throughput was 111 MB/s, but the pipeline stalled whenever reads caught up to the download frontier.
Solution
Inspired by rclone's AsyncReader (which provides smooth reads even without the VFS cache), this adds an AsyncReadBuffer that fills a ring buffer from a background goroutine so FUSE reads are served from pre-filled memory.

Results
Test plan
- Tests with `-race` for AsyncReadBuffer (sequential reads, slow source, error propagation, concurrent read+close, passthrough, GetBufferedOffset)
- `go build ./...` passes
- `dd if=<fuse-file> of=/dev/null bs=1m` and verify ~40+ MB/s