Enable restoring an earlier snapshot if latest fails#4222

Open
pcholakov wants to merge 3 commits into scan-sweep-snapshot-pruning from snapshot-restore-fallback

@pcholakov pcholakov commented Jan 20, 2026

This change updates the partition store to attempt restoring from an earlier snapshot if the latest one fails. A restore can fail because of missing or corrupt SSTs, for example.

The PR also contains an in-tree snapshot chaos test that stresses incremental snapshots and the restore path by killing nodes and by corrupting the latest snapshot (removing some of its SSTs).

  1. Add support for managing a fixed number of retained snapshots #3942
  2. Introduce leases for snapshot coordination #4204
  3. Add incremental snapshot support #4198
  4. Add orphaned snapshot objects pruning #4212
  5. Enable restoring an earlier snapshot if latest fails #4222 ⬅️ you are here

Fixes #3930

github-actions bot commented Jan 20, 2026

Test Results

  7 files  ±0    7 suites  ±0   2m 43s ⏱️ +2s
 47 tests ±0   47 ✅ ±0  0 💤 ±0  0 ❌ ±0 
200 runs  ±0  200 ✅ ±0  0 💤 ±0  0 ❌ ±0 

Results for commit 7bcfd3c. ± Comparison against base commit 86d09a1.

♻️ This comment has been updated with latest results.

@pcholakov pcholakov changed the title from "Snapshot restore falls back to earlier snapshot if available" to "Enable restoring an earlier snapshot if latest fails" Jan 20, 2026
@pcholakov pcholakov force-pushed the snapshot-restore-fallback branch 3 times, most recently from 2f139b0 to 765f99f Compare January 20, 2026 20:53
@pcholakov pcholakov marked this pull request as ready for review January 20, 2026 20:53
@pcholakov pcholakov force-pushed the snapshot-restore-fallback branch from 765f99f to 56e86cb Compare January 21, 2026 07:35
@pcholakov pcholakov force-pushed the scan-sweep-snapshot-pruning branch from 896f467 to 71fca12 Compare January 21, 2026 09:31
@pcholakov pcholakov force-pushed the snapshot-restore-fallback branch from 56e86cb to 19f6c7d Compare January 21, 2026 09:31
@pcholakov pcholakov force-pushed the scan-sweep-snapshot-pruning branch from 71fca12 to cf1468b Compare January 21, 2026 10:38
@pcholakov pcholakov force-pushed the snapshot-restore-fallback branch 2 times, most recently from f3138ae to 43dea07 Compare January 22, 2026 10:13
@pcholakov pcholakov force-pushed the snapshot-restore-fallback branch from 43dea07 to 933bc78 Compare January 29, 2026 18:46
@pcholakov pcholakov force-pushed the scan-sweep-snapshot-pruning branch from cf1468b to 5560969 Compare January 29, 2026 18:46
@pcholakov pcholakov force-pushed the snapshot-restore-fallback branch from 933bc78 to 4610813 Compare February 1, 2026 10:09
@pcholakov pcholakov force-pushed the scan-sweep-snapshot-pruning branch 2 times, most recently from 78e783c to 86d09a1 Compare February 1, 2026 19:02
When the latest snapshot fails to download (network error, corrupt metadata,
missing files), the system now automatically tries older retained snapshots
in descending LSN order until one succeeds or all candidates are exhausted.
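
The fallback order described in this commit message can be illustrated with a minimal Rust sketch. It is not the code from this PR: `SnapshotRef`, `RestoreError`, `restore_with_fallback`, and the `download` callback are hypothetical names chosen for illustration, and the real partition store restore path is likely asynchronous and carries richer error types.

```rust
/// Hypothetical descriptor of a retained snapshot (illustrative fields only).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SnapshotRef {
    pub id: String,
    pub lsn: u64,
}

/// Hypothetical error type for the sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum RestoreError {
    /// No retained snapshots were available to try.
    NoCandidates,
    /// Every candidate failed; holds (snapshot id, failure reason) pairs
    /// in the order the attempts were made.
    AllFailed(Vec<(String, String)>),
}

/// Tries candidates from the highest LSN to the lowest and returns the first
/// snapshot whose download succeeds. `download` stands in for whatever
/// fetches and validates a snapshot (an assumption of this sketch).
pub fn restore_with_fallback<F>(
    mut candidates: Vec<SnapshotRef>,
    mut download: F,
) -> Result<SnapshotRef, RestoreError>
where
    F: FnMut(&SnapshotRef) -> Result<(), String>,
{
    if candidates.is_empty() {
        return Err(RestoreError::NoCandidates);
    }

    // Descending LSN order: the latest snapshot is attempted first.
    candidates.sort_by(|a, b| b.lsn.cmp(&a.lsn));

    let mut failures = Vec::new();
    for snapshot in candidates {
        match download(&snapshot) {
            Ok(()) => return Ok(snapshot),
            // Record the failure and fall back to the next older snapshot.
            Err(reason) => failures.push((snapshot.id.clone(), reason)),
        }
    }

    // All candidates were exhausted without a successful restore.
    Err(RestoreError::AllFailed(failures))
}
```

In this sketch, a caller passes the list of retained snapshots together with a closure that performs the download. Collecting the per-snapshot failure reasons keeps the final error informative when every candidate fails.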
@pcholakov pcholakov force-pushed the snapshot-restore-fallback branch from 4610813 to 7bcfd3c Compare February 1, 2026 19:02