-
Notifications
You must be signed in to change notification settings - Fork 247
Always prefer peer's snapshot #7314
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
So this PR is either hitting or exacerbating the CI instability regarding snapshotting.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR modifies the snapshot fetching mechanism for joining nodes to always prefer the primary's snapshot over local snapshots, regardless of version. The change simplifies the snapshot selection logic by removing the "since" query parameter that previously filtered snapshots based on local state.
Key changes:
- Snapshot fetching no longer considers local snapshot versions when requesting from peers
- Backups redirect to the primary's snapshot when available, falling back to their own only when no primary exists
- Test coverage expanded with a new snapshot selection test
Reviewed Changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/e2e_operations.py | Moved force_become_primary to module level, added test_snapshot_selection, updated test calls with new parameters |
| src/snapshots/fetch.h | Removed latest_local_snapshot parameter and "since" query logic from snapshot fetching functions |
| src/node/rpc/node_frontend.h | Added redirect-to-primary logic for non-primary nodes, added snapshot name header to responses |
| src/host/run.cpp | Removed local snapshot version check, changed error to warning when overwriting existing snapshots |
| include/ccf/http_consts.h | Added new HTTP header constant for snapshot name |
| doc/operations/ledger_snapshot.rst | Updated documentation to reflect new snapshot fetching behavior |
| doc/host_config_schema/cchost_config.json | Updated description to remove "newer" qualifier from fetch behavior |
| CHANGELOG.md | Added changelog entry for the behavior change |
Co-authored-by: Amaury Chamayou <amaury@xargs.fr>
(cherry picked from commit 516fcaa) # Conflicts: # CHANGELOG.md # src/host/run.cpp # src/node/rpc/node_frontend.h # tests/e2e_operations.py
Previously we'd compare versions of snapshots we read from disk, and only take one from the peer if it was more recent. But this still exposes us to file system sync issues with snapshots.
This changes to always take a snapshot from the peer, if one is available.
Since that might mean we already have an identically-named snapshot, we also now overwrite any existing snapshot (with a corresponding log line).