Append commit instead of individual transactions to commitlog by kim · Pull Request #4140 · clockworklabs/SpacetimeDB

kim · 2026-01-27T14:59:07Z

Changes the commitlog (and durability) write API, such that the caller decides how many transactions are in a single commit, and has to supply the transaction offsets.

This simplifies commitlog-side buffering logic to essentially a BufWriter (which, of course, we must not forget to flush). This will help throughput, but offers less opportunity to retry failed writes. This is probably a good thing, as disks can fail in erratic ways, and we should rather crash and re-verify the commitlog (suffix) than continue writing.

To that end, this patch liberally raises panics when there is a chance that internal state could be "poisoned" by partial writes, which may be debatable.

Motivation

The main motivation is to avoid maintaining the transaction offset in two places in such a way that they could diverge. As ordering commits is the responsibility of the datastore, we make it authoritative on this matter -- the commitlog will still check that offsets are contiguous, and refuse to commit if that's not the case.
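To make the contiguity check concrete, here is a minimal sketch under stated assumptions. `Transaction`, `Log`, `OutOfOrder` and the `commit` signature are hypothetical stand-ins for illustration, not the commitlog crate's actual types. The caller supplies the offsets; the log only verifies that they continue where the previous commit ended, and rejects the whole batch otherwise.

```rust
use std::ops::Range;

/// Hypothetical stand-in: a transaction tagged with the offset chosen by the caller.
#[derive(Debug, Clone, PartialEq)]
pub struct Transaction<T> {
    pub offset: u64,
    pub txdata: T,
}

/// Hypothetical error: the supplied offset does not continue the log.
#[derive(Debug, PartialEq)]
pub struct OutOfOrder {
    pub expected: u64,
    pub actual: u64,
}

/// Toy log that only remembers the next offset it expects.
#[derive(Debug, Default)]
pub struct Log {
    next_offset: u64,
}

impl Log {
    /// Accepts the batch only if its offsets are contiguous and start at
    /// `next_offset`. On error, the log state is left unchanged.
    pub fn commit<T>(&mut self, txs: &[Transaction<T>]) -> Result<Range<u64>, OutOfOrder> {
        let start = self.next_offset;
        let mut expected = start;
        for tx in txs {
            if tx.offset != expected {
                return Err(OutOfOrder { expected, actual: tx.offset });
            }
            expected += 1;
        }
        self.next_offset = expected;
        Ok(start..expected)
    }
}
```

In this sketch the datastore remains the single source of truth for offsets; the log keeps just enough state to detect a gap or a repeat.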

A secondary, related motivation is the following:

A "commit" is an atomic unit of storage, meaning that a torn (partial) write of a commit will render the entire commit corrupt, including every transaction in it. There hasn't been a compelling case where we would want more than one transaction in a commit, and we have always configured the server to write exactly one transaction per commit.
The code to handle buffering of transactions is, however, rather complex, as it tries hard to allow the caller to retry writes at commit boundaries. An unfortunate consequence of this is that we'd flush to the OS very often, leaving throughput performance on the table.

So, if there is a compelling case for batching multiple transactions in a commit, it should be the datastore's responsibility.
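As an illustration of why a torn write costs the whole commit, the following sketch frames several transactions behind one length prefix and one trailing checksum. The layout and the FNV-1a checksum are assumptions made for this example only; they are not the commitlog's actual on-disk format.

```rust
use std::convert::TryInto;

/// Stand-in checksum (32-bit FNV-1a); chosen only to keep the sketch std-only.
fn checksum(bytes: &[u8]) -> u32 {
    bytes
        .iter()
        .fold(0x811c_9dc5u32, |h, b| (h ^ *b as u32).wrapping_mul(0x0100_0193))
}

/// Hypothetical frame: [body len][(tx len, tx bytes)...][checksum of body].
pub fn encode_commit(txs: &[&[u8]]) -> Vec<u8> {
    let mut body = Vec::new();
    for tx in txs {
        body.extend_from_slice(&(tx.len() as u32).to_le_bytes());
        body.extend_from_slice(tx);
    }
    let mut out = Vec::with_capacity(body.len() + 8);
    out.extend_from_slice(&(body.len() as u32).to_le_bytes());
    out.extend_from_slice(&body);
    out.extend_from_slice(&checksum(&body).to_le_bytes());
    out
}

/// Returns `None` if the frame is truncated or the checksum does not match.
/// There is no way to salvage only some of the transactions.
pub fn decode_commit(bytes: &[u8]) -> Option<Vec<Vec<u8>>> {
    let len = u32::from_le_bytes(bytes.get(..4)?.try_into().ok()?) as usize;
    let body = bytes.get(4..4 + len)?;
    let sum = u32::from_le_bytes(bytes.get(4 + len..8 + len)?.try_into().ok()?);
    if checksum(body) != sum {
        return None;
    }
    let mut txs = Vec::new();
    let mut rest = body;
    while !rest.is_empty() {
        let n = u32::from_le_bytes(rest.get(..4)?.try_into().ok()?) as usize;
        txs.push(rest.get(4..4 + n)?.to_vec());
        rest = &rest[4 + n..];
    }
    Some(txs)
}
```

Losing even the last byte of the frame makes `decode_commit` reject both transactions, which is the trade-off of batching: a larger commit means more work lost to a single torn write.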

API and ABI breaking changes

Breaks internal APIs only.

Expected complexity level and risk

5 - Mostly for the risk

Testing

Existing tests.

kim · 2026-01-28T16:25:02Z

Nominating @gefjon and @Centril because they appeared in the reviewer suggestions.
@Centril specifically for hints on how to get rid of Box<[_]> allocations.
@Shubham8287 because of previous work on the commitlog.
@joshua-spacetime because of suggesting the change initially.

kim · 2026-01-28T16:28:22Z

crates/commitlog/src/tests/partial.rs

+/// Tests that, when a partial write occurs, we can read all flushed commits
+/// up until the faulty one.
 #[test]
-fn reopen() {


I'm not sure what this was supposed to test originally, so I removed it.

kim · 2026-01-28T16:29:04Z

crates/commitlog/src/tests/partial.rs

+/// up until the faulty one.
 #[test]
-fn reopen() {
+fn read_log_up_to_partial_write() {


This is basically the test previously named reopen.

kim · 2026-01-28T16:57:11Z

crates/commitlog/src/commitlog.rs

+        let writer = &mut self.head;
+        let committed = writer.commit(transactions)?;
+        if writer.len() >= self.opts.max_segment_size {
+            self.flush().expect("failed to flush segment");


This seemed a bit surprising to me at first -- but the BufWriter has no way of knowing how many bytes did make it. So if flush fails, the buffer is basically garbage.
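A small sketch of that failure mode, using only `std::io`. `FlakySink` and `write_commit` are hypothetical names invented for this example: the sink accepts a few bytes and then starts failing, so `flush` returns an error after a prefix of the commit has already reached the sink.

```rust
use std::io::{self, BufWriter, Write};

/// Hypothetical sink that accepts `budget` bytes in total, then fails every write.
struct FlakySink {
    data: Vec<u8>,
    budget: usize,
}

impl Write for FlakySink {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        if self.budget == 0 {
            return Err(io::Error::new(io::ErrorKind::Other, "disk full"));
        }
        // Partial writes are allowed by the `Write` contract.
        let n = buf.len().min(self.budget);
        self.data.extend_from_slice(&buf[..n]);
        self.budget -= n;
        Ok(n)
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

/// Buffers `commit`, flushes once, and reports the flush result together
/// with whatever reached the sink. Assumes `commit` is smaller than the
/// default `BufWriter` capacity, so `write_all` only fills the buffer.
pub fn write_commit(budget: usize, commit: &[u8]) -> (io::Result<()>, Vec<u8>) {
    let mut writer = BufWriter::new(FlakySink { data: Vec::new(), budget });
    writer.write_all(commit).expect("buffering in memory should not fail");
    let result = writer.flush();
    let on_disk = writer.get_ref().data.clone();
    (result, on_disk)
}
```

With a budget of 3 bytes, the flush fails and the sink holds only `com` from `commit-1`: a torn commit. Treating that as fatal, as the `expect` in the diff does, avoids appending further commits after the torn one.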

Centril · 2026-01-28T17:04:35Z

> @Centril specifically for hints on how to get rid of Box<[_]> allocations.

What is the typical length of the slice?

kim · 2026-01-28T18:33:02Z

> > @Centril specifically for hints on how to get rid of Box<[_]> allocations.

> What is the typical length of the slice?

Well, 1 :D
One way to solve it is to allow only a single transaction in the Durability trait until we actually need more than one. The impl IntoIterator in the commitlog crate is supposedly going to be optimized away then.

Centril · 2026-01-28T20:37:04Z

> > > @Centril specifically for hints on how to get rid of Box<[_]> allocations.

> > What is the typical length of the slice?

> Well, 1 :D One way to solve it is to allow only a single transaction in the Durability trait until we actually need more than one. The impl IntoIterator in the commitlog crate is supposedly going to be optimized away then.

Yeah, I think this is the right call until we need something else.

Shubham8287

This looks good, and it simplifies the commitlog's write API a lot.

I wonder if we should test replication with this branch, to surface any bugs that exist in the n != 1 case.

kim · 2026-01-29T10:34:43Z

> I wonder if we should test replication with this branch, to surface any bugs that exist in the n != 1 case.

We will benefit from BufWriter buffering without allowing n > 1 in the Durability trait (just flush after n commit() calls). So I'm considering reverting the Durability change to

fn append_tx(&self, tx: Transaction<Self::TxData>);

which requires the offset to be supplied, but doesn't allocate a Box<[_]> that we don't actually need currently.
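A rough sketch of what that shape could look like from the datastore's side. Only `append_tx`, `Transaction` and the `TxData` associated type are taken from the signature above; `MemLog`, `Datastore` and all field names are hypothetical, and the real trait has more to it than shown here.

```rust
use std::sync::Mutex;

/// Assumed shape: a payload plus the offset assigned by the datastore.
#[derive(Debug, Clone, PartialEq)]
pub struct Transaction<T> {
    pub offset: u64,
    pub txdata: T,
}

/// Simplified trait: one transaction per call, so no boxed slice is needed.
pub trait Durability {
    type TxData;
    fn append_tx(&self, tx: Transaction<Self::TxData>);
}

/// Hypothetical in-memory implementation, for illustration only.
#[derive(Default)]
pub struct MemLog {
    txs: Mutex<Vec<Transaction<String>>>,
}

impl MemLog {
    pub fn offsets(&self) -> Vec<u64> {
        self.txs.lock().unwrap().iter().map(|tx| tx.offset).collect()
    }
}

impl Durability for MemLog {
    type TxData = String;

    fn append_tx(&self, tx: Transaction<Self::TxData>) {
        self.txs.lock().unwrap().push(tx);
    }
}

/// Hypothetical datastore: the only place where the offset counter lives.
pub struct Datastore<D> {
    next_offset: u64,
    durability: D,
}

impl<D: Durability> Datastore<D> {
    pub fn new(durability: D) -> Self {
        Self { next_offset: 0, durability }
    }

    /// Assigns the next offset, hands the transaction over, and returns the offset.
    pub fn commit(&mut self, txdata: D::TxData) -> u64 {
        let offset = self.next_offset;
        self.durability.append_tx(Transaction { offset, txdata });
        self.next_offset += 1;
        offset
    }

    pub fn durability(&self) -> &D {
        &self.durability
    }
}
```

Passing a single `Transaction` by value keeps the call allocation-free, and a batched method could still be added later if a use case for n > 1 appears.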

kim · 2026-01-30T11:09:13Z

Note that Rust does not by default run destructors when the program is terminated by a signal (any signal). This, and the default being unconfirmed reads, are the reason the commitlog before this patch would flush after every write.

I added a config option to preserve this behavior. The question is whether we should make it the default for standalone. (We should probably also make use of the ctrlc crate for server processes.)
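To illustrate the trade-off, a minimal sketch with a hypothetical `flush_on_commit` flag (the actual option name and the writer type in the patch may differ). `BufWriter` flushes on drop, but if the process is killed by a signal that drop never runs, so anything still sitting in the buffer is lost.

```rust
use std::io::{self, BufWriter, Write};

/// Hypothetical writer; a `Vec<u8>` stands in for the segment file.
pub struct Writer {
    inner: BufWriter<Vec<u8>>,
    /// Assumed option: flush to the sink after every commit (previous behaviour).
    flush_on_commit: bool,
}

impl Writer {
    pub fn new(flush_on_commit: bool) -> Self {
        Self {
            inner: BufWriter::new(Vec::new()),
            flush_on_commit,
        }
    }

    /// Appends one commit; flushes immediately only if configured to do so.
    pub fn commit(&mut self, bytes: &[u8]) -> io::Result<()> {
        self.inner.write_all(bytes)?;
        if self.flush_on_commit {
            self.inner.flush()?;
        }
        Ok(())
    }

    /// Explicit flush, e.g. driven by a periodic sync.
    pub fn flush(&mut self) -> io::Result<()> {
        self.inner.flush()
    }

    /// Bytes that have left the buffer and reached the sink.
    pub fn bytes_in_sink(&self) -> usize {
        self.inner.get_ref().len()
    }
}
```

With the flag off, a small commit stays in the buffer until `flush` is called; with it on, the commit reaches the sink before `commit` returns, at the cost of one write per transaction.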

kim added 11 commits January 27, 2026 15:56

Append commit instead of individual transactions to commitlog

06f9c2e

This moves the following responsibilities to the datastore:
- maintenance of the transaction offset
- deciding how many transactions are in a commit

Restore some commentary

a0de7d9

Clear commit before returning error

e73abcc

More commentary

d5b29cc

Panic if >u16::MAX transactions

de918d5

Allowing to restore `Committed` return

Docs

f722edb

set_epoch doesn't need to flush

550aa4b

Return Committed from all commit methods

876f07b

Docs

0cf8def

Use assert

a50b422

Restore the commit corruption after ENOSPC test

fd27477

kim linked an issue Jan 28, 2026 that may be closed by this pull request

Make datastore responsible for maintaining the transaction offset #4125

Open

kim added 2 commits January 28, 2026 17:17

Add TODO

f37fad3

Fix fallocate tests

4a16bf5

kim requested review from Centril, Shubham8287, gefjon and joshua-spacetime January 28, 2026 16:22

kim marked this pull request as ready for review January 28, 2026 16:25

kim commented Jan 28, 2026

Shubham8287 approved these changes Jan 29, 2026

kim added 2 commits January 29, 2026 12:35

Revert durability trait changes

0ad5bdf

We don't really need batched transactions at the moment, so avoid the boxed array allocation. Durability::append_tx takes a `Transaction`, though, requiring the offset to be supplied by the datastore.

Merge remote-tracking branch 'origin/master' into kim/commitlog/append-commit

6450e54

kim added 3 commits January 29, 2026 19:50

Make default sync interval much smaller

c31efd0

Add option to flush after each tx (previous behaviour)

a582f67

Merge remote-tracking branch 'origin/master' into HEAD

f26a404

kim mentioned this pull request Jan 30, 2026

[WIP] commitlog: Manually flush the inner BufWriter #3839

Closed

kim wants to merge 18 commits into master from kim/commitlog/append-commit