[WIP][cpp] Add writer option to flush data to disk every N bytes#1606
Draft
MichaelOrlov wants to merge 1 commit into
Add `sizeForFsynch` to `McapWriterOptions` to allow explicit file syncs after a configurable amount of data has been written. When the option is greater than `0`, the file-backed writer tracks bytes written since the last sync and calls `fsync(fd)` once the configured threshold is met or exceeded. A value of `0` keeps the current behavior and leaves flush timing to the operating system. This provides an opt-in durability control for workloads that need more frequent data persistence, with the tradeoff that `fsync(fd)` is blocking and may reduce write throughput.
Signed-off-by: Michael Orlov <morlovmr@gmail.com>
Changelog
Add a new `sizeForFsynch` MCAP writer option to periodically flush file-backed writes to persistent storage after every configured number of bytes.
Docs
None
Description
This change adds a new writer option, `sizeForFsynch`, to allow explicit flushing of file-backed MCAP output to disk after every configured number of written bytes.
Today, once data leaves the MCAP library and is written to a file, it can still remain buffered in OS-managed DRAM before it reaches persistent storage. For recording workloads, that creates a durability gap: if the system crashes or loses power, recently written data may still be lost even though it has already passed through the MCAP writer.
This PR adds an opt-in API to reduce that risk. When `sizeForFsynch` is greater than `0`, the file-backed writer tracks how many bytes have been written since the last sync and calls `fsync(fd)` once the configured threshold is met or exceeded. A value of `0` preserves the current behavior and leaves flush timing entirely to the operating system.
The intent is to give users an explicit durability/performance tradeoff. Applications that prioritize throughput can keep the default behavior. Applications that prioritize minimizing data loss during recording can choose a threshold appropriate for their workload and storage characteristics.
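The threshold logic described above can be sketched roughly as follows. This is a minimal, standalone illustration for POSIX systems and not the code in this PR: the class `SyncingFileWriter` and its accessors `syncCount()` and `bytesSinceSync()` are hypothetical names invented for the example, and only the option name `sizeForFsynch` and the `fsync(fd)` call come from the description.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <cstdint>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical illustration of byte-threshold syncing; not the MCAP writer.
class SyncingFileWriter {
public:
  // sizeForFsynch == 0 disables explicit syncing (the OS decides when to flush).
  SyncingFileWriter(const char* path, uint64_t sizeForFsynch)
      : fd_(::open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644)),
        sizeForFsynch_(sizeForFsynch) {}

  ~SyncingFileWriter() {
    if (fd_ >= 0) {
      ::close(fd_);
    }
  }

  SyncingFileWriter(const SyncingFileWriter&) = delete;
  SyncingFileWriter& operator=(const SyncingFileWriter&) = delete;

  bool isOpen() const { return fd_ >= 0; }

  // Writes all bytes, then syncs if the threshold is met or exceeded.
  // Returns false on any I/O error.
  bool write(const char* data, size_t size) {
    assert(data != nullptr || size == 0);
    if (fd_ < 0) {
      return false;
    }
    size_t written = 0;
    while (written < size) {
      const ssize_t n = ::write(fd_, data + written, size - written);
      if (n < 0 && errno == EINTR) {
        continue;  // interrupted before writing anything: retry
      }
      if (n <= 0) {
        return false;
      }
      written += static_cast<size_t>(n);
    }
    bytesSinceSync_ += size;
    if (sizeForFsynch_ > 0 && bytesSinceSync_ >= sizeForFsynch_) {
      // fsync blocks until the kernel reports the data as persisted.
      if (::fsync(fd_) != 0) {
        return false;
      }
      bytesSinceSync_ = 0;
      ++syncCount_;
    }
    return true;
  }

  uint64_t syncCount() const { return syncCount_; }
  uint64_t bytesSinceSync() const { return bytesSinceSync_; }

private:
  int fd_;
  uint64_t sizeForFsynch_;
  uint64_t bytesSinceSync_ = 0;
  uint64_t syncCount_ = 0;
};
```

With a threshold of 10 bytes and three 4-byte writes, this sketch syncs once, after the third write, and resets its counter. With a threshold of `0` it never calls `fsync` itself. Resetting the counter to zero rather than carrying over the excess is a choice made for this sketch, matching the "bytes written since the last sync" wording.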
This is especially relevant for long-running or safety-critical recordings where losing the last buffered portion of the output is undesirable. Background reference: Ensuring data reaches disk
Unit tests were added to verify:
- … `0` …
- `McapWriter` correctly forwards the option to the file-backed writer path
Manual testing:
- `g++` build
Before: MCAP file writes relied entirely on OS flush behavior. Data written by the library could still remain in kernel buffers for an unspecified amount of time before reaching persistent storage.
After: MCAP exposes `sizeForFsynch` so callers can request explicit periodic `fsync(fd)` calls after every configured number of bytes written, reducing the window of potential data loss during recording.