[WIP][cpp] Add writer option to flush data to disk every N bytes #1606

Draft
MichaelOrlov wants to merge 1 commit into foxglove:main from MichaelOrlov:morlov/add-option-to-flush-data-to-disk-for-n-bytes

@MichaelOrlov
Contributor

Changelog

Add a new sizeForFsynch MCAP writer option to periodically flush file-backed writes to persistent storage after every configured number of bytes.

Docs

None

Description

This change adds a new writer option, sizeForFsynch, to allow explicit flushing of file-backed MCAP output to disk after every configured number of written bytes.

Today, once data leaves the MCAP library and is written to a file, it can still sit in the operating system's page cache (volatile memory) before it reaches persistent storage. For recording workloads, that creates a durability gap: if the system crashes or loses power, recently written data may be lost even though it has already passed through the MCAP writer.

This PR adds an opt-in API to reduce that risk. When sizeForFsynch is greater than 0, the file-backed writer tracks how many bytes have been written since the last sync and calls fsync(fd) once the configured threshold is met or exceeded. A value of 0 preserves the current behavior and leaves flush timing entirely to the operating system.
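
The byte-counting behavior described above can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: `ThresholdSyncWriter` and `SyncFn` are hypothetical names, the sketch assumes a POSIX system, and the sync call is injectable only so the counting logic can be exercised without a real disk.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <unistd.h>  // POSIX: write, fsync, ssize_t

// Hypothetical helper (not the MCAP FileWriter): counts bytes written to a
// file descriptor and syncs once `threshold` bytes have accumulated.
class ThresholdSyncWriter {
public:
  using SyncFn = std::function<int(int)>;

  // threshold == 0 disables explicit syncs, mirroring the option's default.
  ThresholdSyncWriter(int fd, uint64_t threshold,
                      SyncFn sync = [](int f) { return ::fsync(f); })
      : fd_(fd), threshold_(threshold), sync_(std::move(sync)) {
    assert(fd_ >= 0);
  }

  // Writes the whole buffer. Returns false if a write or sync fails.
  // EINTR handling is omitted to keep the sketch short.
  bool write(const void* data, size_t size) {
    const char* p = static_cast<const char*>(data);
    size_t remaining = size;
    while (remaining > 0) {
      ssize_t n = ::write(fd_, p, remaining);
      if (n <= 0) {
        return false;
      }
      p += n;
      remaining -= static_cast<size_t>(n);
    }
    bytesSinceSync_ += size;
    // Sync once the threshold is met or exceeded, then restart the count.
    if (threshold_ > 0 && bytesSinceSync_ >= threshold_) {
      if (sync_(fd_) != 0) {
        return false;
      }
      bytesSinceSync_ = 0;
    }
    return true;
  }

  uint64_t bytesSinceSync() const { return bytesSinceSync_; }

private:
  int fd_;
  uint64_t threshold_;
  SyncFn sync_;
  uint64_t bytesSinceSync_ = 0;
};
```

With this shape, the amount of data not yet explicitly synced stays below the threshold plus the size of the most recent write, which is the window of potential loss the option is meant to bound.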

The intent is to give users an explicit durability/performance tradeoff. Applications that prioritize throughput can keep the default behavior. Applications that prioritize minimizing data loss during recording can choose a threshold appropriate for their workload and storage characteristics.

This is especially relevant for long-running or safety-critical recordings where losing the last buffered portion of the output is undesirable. Background reference: Ensuring data reaches disk

Unit tests were added to verify:

  • no explicit sync happens when the threshold is 0
  • sync is triggered when accumulated bytes reach the threshold
  • McapWriter correctly forwards the option to the file-backed writer path

Manual testing:

  • compiled the C++ unit test binary with a direct local g++ build
  • ran the sync-focused tests and confirmed they passed
  • ran the full unit test binary and confirmed all tests passed in the local verification build

Before:

MCAP file writes relied entirely on OS flush behavior. Data written by the library could still remain in kernel buffers for an unspecified amount of time before reaching persistent storage.

After:

MCAP exposes sizeForFsynch so callers can request explicit periodic fsync(fd) calls after every configured number of bytes written, reducing the window of potential data loss during recording.

Add `sizeForFsynch` to `McapWriterOptions` to allow explicit file syncs after
a configurable amount of data has been written.

When the option is greater than `0`, the file-backed writer tracks bytes written
since the last sync and calls `fsync(fd)` once the configured threshold is met
or exceeded. A value of `0` keeps the current behavior and leaves flush timing
to the operating system.

This provides an opt-in durability control for workloads that need more frequent
data persistence, with the tradeoff that `fsync(fd)` is blocking and may reduce
write throughput.

Signed-off-by: Michael Orlov <morlovmr@gmail.com>

MichaelOrlov marked this pull request as draft on March 20, 2026, 04:32