
feat(flatkv): add snapshot, WAL catchup, and rollback support #2972

Open
blindchaser wants to merge 21 commits into main from yiren/flatkv-snapshot


@blindchaser
Contributor

Describe your changes and provide context

Introduce a snapshot-based lifecycle for FlatKV so that restarts replay
only from the nearest PebbleDB checkpoint instead of the full WAL.

Key changes:

  • Snapshot management: immutable PebbleDB checkpoints created via
    Checkpoint(), managed through a "current" symlink and atomic
    directory operations. Configurable interval, retention, and
    minimum time between snapshots.
  • Working directory: mutable clone of the baseline snapshot (hardlinks
    for .sst files) so writes never mutate snapshot dirs.
  • WAL catchup: on open, replay changelog entries from the snapshot
    version to the target version using O(1) arithmetic + O(log N)
    binary search for offset resolution.
  • Rollback: rewind to the best snapshot <= target, truncate WAL,
    replay to exact version, and prune future snapshots.
  • File lock: prevent concurrent access from multiple processes.
  • Migration: automatically move pre-snapshot flat layout into a
    versioned snapshot directory on first open.
  • Auto WAL truncation: periodically discard WAL entries older than
    the earliest retained snapshot.
  • Fix account LtHash baseline capture to use pre-batch state when
    multiple ApplyChangeSets calls precede a single Commit.
  • Add legacyDB to flushAllDBs.
  • Mark Iterator/IteratorByPrefix as EXPERIMENTAL (unused in production).
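
The offset resolution described in the WAL catchup bullet could look roughly like the sketch below. It is only an illustration, not the code in this PR: the `wal` type, its fields, and `offsetForVersion` are hypothetical names, and the changelog is modeled as an in-memory slice of strictly increasing versions. The fast path guesses the offset by arithmetic (one entry per version, no gaps) and the slow path falls back to a binary search.

```go
package main

import "sort"

// wal is a hypothetical in-memory stand-in for the changelog: versions[i] is
// the version stored at offset first+i, and versions are strictly increasing.
type wal struct {
	first    uint64
	versions []int64
}

// offsetForVersion returns the offset of the first entry whose version is
// greater than or equal to target, and false if no such entry exists.
func (w wal) offsetForVersion(target int64) (uint64, bool) {
	n := len(w.versions)
	if n == 0 || target > w.versions[n-1] {
		return 0, false
	}
	if target <= w.versions[0] {
		return w.first, true
	}
	// O(1) fast path: assume one entry per version with no gaps, then
	// verify the guess against the entry actually stored there.
	guess := target - w.versions[0]
	if guess < int64(n) && w.versions[guess] == target {
		return w.first + uint64(guess), true
	}
	// O(log N) slow path: binary search over the sorted versions.
	i := sort.Search(n, func(i int) bool { return w.versions[i] >= target })
	return w.first + uint64(i), true
}
```

Verifying the arithmetic guess before trusting it keeps the fast path safe even if the changelog turns out to have gaps or several entries per version.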

Testing performed to validate your change

@github-actions

github-actions bot commented Feb 24, 2026

The latest Buf updates on your PR. Results from workflow Buf / buf (pull_request).

Build      Format     Lint       Breaking   Updated (UTC)
✅ passed  ✅ passed  ✅ passed  ✅ passed  Feb 25, 2026, 3:35 PM


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: bccc292045

Comment on lines +521 to +523
if err := s.open(); err != nil {
	return fmt.Errorf("open for rollback: %w", err)
}

P1: Reclone baseline snapshot before rollback catchup

When Rollback updates current and immediately calls open, it does not invalidate working/SNAPSHOT_BASE, so after a restart (where working was already cloned from that same snapshot) createWorkingDir reuses a working DB that may already be at a higher version than targetVersion. In that case catchup skips all entries (entry.Version <= committedVersion), rollback fails with a version mismatch, and this can happen after the WAL has already been truncated, leaving rollback in a partially-mutated state. Force a fresh clone of working from the selected snapshot before opening during rollback (like LoadVersion(target>0) already does).

s.pruneSnapshots(dir, version)

success = true
s.lastSnapshotTime = time.Now()

Check warning (Code scanning / CodeQL): Calling the system time

Calling the system time may be a possible source of non-determinism.

@codecov

codecov bot commented Feb 24, 2026

Codecov Report

❌ Patch coverage is 64.27406% with 219 lines in your changes missing coverage. Please review.
✅ Project coverage is 66.84%. Comparing base (3821195) to head (075dc71).

Files with missing lines                        Patch %   Lines
sei-db/state_db/sc/flatkv/snapshot.go           67.58%    57 Missing and 49 partials ⚠️
sei-db/state_db/sc/flatkv/store.go              59.23%    30 Missing and 23 partials ⚠️
sei-db/state_db/sc/flatkv/store_catchup.go      58.33%    23 Missing and 17 partials ⚠️
sei-db/state_db/sc/flatkv/store_write.go        46.42%    8 Missing and 7 partials ⚠️
sei-db/state_db/sc/flatkv/store_lifecycle.go    76.47%    2 Missing and 2 partials ⚠️
sei-db/state_db/sc/flatkv/store_read.go         50.00%    1 Missing ⚠️
Additional details and impacted files

@@             Coverage Diff             @@
##             main    #2972       +/-   ##
===========================================
+ Coverage   58.11%   66.84%    +8.73%     
===========================================
  Files        2109       25     -2084     
  Lines      173402     1903   -171499     
===========================================
- Hits       100765     1272    -99493     
+ Misses      63684      432    -63252     
+ Partials     8953      199     -8754     
Flag Coverage Δ
sei-chain 65.96% <64.27%> (+7.88%) ⬆️
sei-db 69.50% <ø> (ø)

Flags with carried forward coverage won't be shown.

Files with missing lines Coverage Δ
sei-db/db_engine/pebbledb/db.go 93.50% <100.00%> (+0.35%) ⬆️
sei-db/state_db/sc/flatkv/config.go 100.00% <100.00%> (ø)
sei-db/state_db/sc/flatkv/iterator.go 39.65% <100.00%> (+1.63%) ⬆️
sei-db/state_db/sc/flatkv/keys.go 100.00% <ø> (+4.70%) ⬆️
sei-db/state_db/sc/flatkv/store_meta.go 67.56% <100.00%> (ø)
sei-db/state_db/sc/flatkv/store_read.go 57.14% <50.00%> (+3.23%) ⬆️
sei-db/state_db/sc/flatkv/store_lifecycle.go 52.38% <76.47%> (+3.99%) ⬆️
sei-db/state_db/sc/flatkv/store_write.go 67.96% <46.42%> (-3.01%) ⬇️
sei-db/state_db/sc/flatkv/store_catchup.go 58.33% <58.33%> (ø)
sei-db/state_db/sc/flatkv/store.go 60.10% <59.23%> (-3.94%) ⬇️
... and 1 more

... and 2086 files with indirect coverage changes

Comment on lines +58 to +60
// Checkpointable is an optional capability for DB engines that support
// efficient point-in-time snapshots via filesystem hardlinks.
type Checkpointable interface {

Can you augment the godoc with information about concurrency? For example, is it safe to call this method concurrently with updates in other threads? When this method returns, is the checkpoint capable of surviving a host OS crash?

Comment on lines +20 to +24
snapshotPrefix = "snapshot-"
snapshotDirLen = len(snapshotPrefix) + 20

currentLink = "current"
currentTmpLink = "current-tmp"

Short descriptions for what each constant is for might be helpful.

One way to document this, feel free to push back if you think it's too much overhead.

At past companies, when documenting this sort of file layout, I sometimes find it useful to do the following:

  • write a simple unit test that generates a basic file structure
  • pause the unit test before it deletes its data
  • run tree on the directory created by the test
  • edit the result for readability and copy-paste it somewhere

Here's an example: https://github.com/Layr-Labs/eigenda/blob/master/litt/docs/filesystem_layout.md

If the file layout is too big, it might make sense to split it out into a markdown file that you can just reference in the godoc.

// not a full path.
func updateCurrentSymlink(root, snapshotDir string) error {
	tmpPath := filepath.Join(root, currentTmpLink)
	_ = os.Remove(tmpPath)

What if this removal fails due to invalid file permissions? I'm guessing you aren't checking the error in case it fails due to the path not existing. Perhaps this can first check if the file exists, delete it if it exists, and then return an error if that deletion fails.

Comment on lines +192 to +196
// flatkv/
// current -> snapshot-NNNNN
// snapshot-NNNNN/{account,code,...}/ (immutable)
// working/{account,code,...}/ (mutable clone)
// changelog/ (WAL, shared)

Ah, didn't see this when I made my comment above. It might be nice to replicate this to the other file that deals with directory names.
