-
Notifications
You must be signed in to change notification settings - Fork 1.1k
Lumen snapshot invariant #5059
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Lumen snapshot invariant #5059
Conversation
| } | ||
|
|
||
| std::string | ||
| ConservationOfLumens::checkSnapshot( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something that's important to note here - ConservationOfLumens isn't a strict invariant, so it won't take down the node. Instead, it'll log an error that our alerting will pick up. If we're not fine with this, we'll either need to make this invariant strict, or create a new invariant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the invariant is still very likely to fail in this case since the balance would have to match (by luck) but I agree this should probably be handled more precisely by passing in another out-parameter for an error message.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe you meant to comment here right? I agree and updated that code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR implements a snapshot-based invariant for conservation of lumens to validate that the total native asset supply matches the totalCoins value in the ledger header. The implementation scans both live and hot archive buckets to calculate the sum of all native balances and compares it against the recorded total.
Key changes:
- Adds
checkSnapshotmethod toConservationOfLumensinvariant that scans bucket lists - Introduces
isStoppingcallback parameter to allow early termination during application shutdown - Adds test method
getLastClosedLedgerStateForTestingto access ledger state for testing
Reviewed changes
Copilot reviewed 14 out of 14 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| src/main/AppConnector.h/cpp | Adds isStopping() method to expose application shutdown state |
| src/ledger/LedgerManager.h | Adds getLastClosedLedgerStateForTesting() virtual method declaration |
| src/ledger/LedgerManagerImpl.h/cpp | Implements getLastClosedLedgerStateForTesting() for test access and updates snapshot invariant call with isStopping callback |
| src/invariant/InvariantManager.h | Updates runStateSnapshotInvariant signature to include isStopping parameter |
| src/invariant/InvariantManagerImpl.h/cpp | Forwards isStopping callback to individual invariant checks |
| src/invariant/Invariant.h | Adds isStopping parameter to checkSnapshot virtual method |
| src/invariant/ConservationOfLumens.h/cpp | Implements snapshot validation by scanning live and hot archive buckets to sum native balances |
| src/invariant/ArchivedStateConsistency.h/cpp | Updates signature to accept isStopping parameter (unused in implementation) |
| src/invariant/test/ConservationOfLumensTests.cpp | Adds test case validating the snapshot invariant correctly detects mismatches in total lumens |
26c82e7 to
bbd59fa
Compare
SirTyson
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for working on this! I think it looks good overall, just a few small nits and a suggestion on a memory/IO optimization. I think we should also improve the test to verify a non-empty and corrupt archive.
| } | ||
|
|
||
| // Check if we should stop before scanning hot archive | ||
| if (isStopping()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: I don't think this is necessary, since we check isStopping on the first iteration of scanHotArchiveBucket anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is still there?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah sorry I must've wiped the change during a rebase. Fixed.
|
|
||
| // Note: We should figure out how to force the eviction of an SAC contract | ||
| // balance. The SAC auto extends balances, so simply waiting for TTL to | ||
| // expire is not practical. We need to update the in memory ttl to a lower |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given this is a BucketList specific invariant, I think we can test it by just manually populating the BucketList like we do in the "State archival eviction invariant" test. It's pretty easy to simulate a bad archival by just adding a DEADENTRY for the key in the live BucketList and an ARCHIVED key with the wrong value to the archive via the BucketTestUtils::BucketTestApplication interface. This test harness also automatically updates the in-memory state as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a test. I verified that it's triggering correctly in the test, but let me know if the code to populate the bucket can be improved.
| } | ||
|
|
||
| std::string | ||
| ConservationOfLumens::checkSnapshot( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the invariant is still very likely to fail in this case since the balance would have to match (by luck) but I agree this should probably be handled more precisely by passing in another out-parameter for an error message.
graydon
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just the one nit same as Garand, otherwise LGTM
Previously, processEntryIfNew would log overflow errors but allow the invariant to continue, causing it to pass despite data corruption. Now passes error string by reference and returns it from the invariant, causing proper failure on overflow. Changes: - Add errorMsg parameter to processEntryIfNew - Remove CLOG_ERROR calls, use fmt::format to set errorMsg - Update scanLiveBucket and scanHotArchiveBucket to accept and pass errorMsg - Check errorMsg after scan phases and return immediately if set 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Allows early termination during shutdown, preventing unnecessary work. Changes: - Check isStopping at start of checkIfLiveEntryInArchive lambda - Check isStopping between type iterations - Return empty string (not error) when stopping 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Add comprehensive bucket-level test using BucketTestApplication pattern. Tests three scenarios: 1. Invariant passes with correct bucket state 2. Invariant fails when bucket balance doesn't match totalCoins 3. Invariant handles shadowing correctly (LIVE entry shadows INIT entry) Addresses feedback from PR review to test actual bucket corruption rather than just modifying totalCoins. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
SirTyson
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall LGTM, with a few small cleanups in the unit test.
| app->getLedgerManager().getLastClosedLedgerStateForTesting(); | ||
| auto& inMemoryState = | ||
| app->getLedgerManager().getInMemorySorobanStateForTesting(); | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we want to check that the test passes with a populated Hot Archive before corrupting it, as they other test case does not have any hot archive populated state.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah good point. Added a second balance that is correctly archived.
| acc1Modified.data.account().balance = 5000; | ||
|
|
||
| // Add modified acc1 as LIVE entry (shadows the INIT entry) | ||
| lm.setNextLedgerEntryBatchForBucketTesting( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this will actually create a shadow, since we'll just merge into the current snap immediately. I think if we close a few more ledgers in between the INIT and LIVE events (like 4 ledgers) we can guarantee an actual shadow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah good point. Updated.
| auto contractAddr = makeContractAddress(sha256("contract")); | ||
| REQUIRE(client.transfer(a1, contractAddr, 100)); | ||
|
|
||
| // Note: We should figure out how to force the eviction of an SAC contract |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment can be removed now right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup removed
| // Add entries to live bucket | ||
| lm.setNextLedgerEntryBatchForBucketTesting({acc1, acc2}, // init entries | ||
| {}, // live entries | ||
| {}); // dead entries |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: I don't think :: is required
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated
|
|
||
| // Remove from live buckets and inject corrupted entry into hot archive | ||
| lm.setNextLedgerEntryBatchForBucketTesting({}, {}, {balanceKey}); | ||
| lm.setNextArchiveBatchForBucketTesting({balanceEntry}, {}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: for correctness, we should also delete the TTL entry
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated, although let me know if I did that right.
Description
Resolves #5028
Checklist
clang-formatv8.0.0 (viamake formator the Visual Studio extension)