-
Notifications
You must be signed in to change notification settings - Fork 1.2k
feat: verify and repair evodb diffs automatically at node startup #6999
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
✅ No Merge Conflicts DetectedThis PR currently has no conflicts with other open PRs. |
Automatically verify and repair deterministic masternode list diffs in evodb during node startup. Helps detect and fix database corruption without manual intervention. - Add DB_LIST_REPAIRED marker to track when repair is complete - Skip repair during reindex (fresh rebuild) - Add -forceevodbrepair flag to force re-verification - Shutdown gracefully on critical errors with user instructions - Run in background thread with minimal cs_main locking Co-Authored-By: Claude Code (Anthropic) <noreply@anthropic.com>
WalkthroughThree files are modified to introduce a repair-tracking mechanism for the Evo masternode list database. A new persistent repair flag ( Sequence DiagramsequenceDiagram
participant Startup as Startup Flow
participant MNMgr as CDeterministicMNManager
participant EvoDB as EvoDB (Database)
participant ChainHelper as ChainHelper
participant SpecialTx as SpecialTxMan
Startup->>MNMgr: IsRepaired()?
MNMgr->>EvoDB: Check DB_LIST_REPAIRED flag
EvoDB-->>MNMgr: Flag status
MNMgr-->>Startup: Repaired status
alt Already Repaired & !forceevodbrepair
Startup->>Startup: Skip repair, log message
else Reindexing in Progress
Startup->>MNMgr: CompleteRepair()
MNMgr->>EvoDB: Write DB_LIST_REPAIRED
MNMgr->>EvoDB: Commit & flush
else Repair Needed
Startup->>ChainHelper: Build list via callback
ChainHelper->>SpecialTx: RebuildListFromBlock
SpecialTx-->>ChainHelper: Rebuilt list
ChainHelper-->>Startup: List built
Startup->>MNMgr: RecalculateAndRepairDiffs(callback)
MNMgr->>MNMgr: Verify & repair diffs
alt Verification Errors
MNMgr-->>Startup: verification_errors
Startup->>Startup: Log errors
end
alt Repair Errors
MNMgr-->>Startup: repair_errors
Startup->>Startup: Log critical, suggest reindex
Startup->>Startup: Initiate shutdown
else Success
Startup->>MNMgr: CompleteRepair()
MNMgr->>EvoDB: Write DB_LIST_REPAIRED
MNMgr->>EvoDB: Commit & flush
Startup->>Startup: Log summary
end
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (3)
src/evo/deterministicmns.h (1)
743-749: New repair-status API on CDeterministicMNManager looks reasonableThe addition of
IsRepaired()andCompleteRepair()is consistent with the cpp implementation and the startup repair workflow; public exposure here is appropriate.You may want to document (brief comment) that these are intended to be called only from startup/maintenance paths and that
IsRepaired()is purely a DB flag check, not a live consistency check, but that's optional.src/evo/deterministicmns.cpp (1)
1694-1706: IsRepaired/CompleteRepair semantics and error handling
IsRepaired()as a thinm_evoDb.Exists(DB_LIST_REPAIRED)wrapper is fine for a durable “have we ever completed repair” marker and matches how it’s used at startup.In
CompleteRepair():
- The transactional write +
CommitRootTransaction()pattern is consistent with other Evo DB upgrade flows.- Using
assert(false)on commit failure is stricter than the migration path (which returnsfalse). Given this is only called from startup/reindex paths, this is probably acceptable, but you might consider returning abooland letting the caller log and initiate shutdown instead of hard-asserting, for consistency withMigrateLegacyDiffs()and to keep behavior uniform across DB-maintenance routines.src/init.cpp (1)
2414-2460: Startup EvoDB repair flow is well-placed; consider a couple of small robustness tweaksThe overall flow in the
loadblkthread looks solid:
- Reindex/-reindex-chainstate: you explicitly skip repair and mark the DB as repaired via
CompleteRepair(), which avoids doing heavy work while the entire history is being rebuilt.- Normal startup:
- If
IsRepaired()and-forceevodbrepair=0, you skip repair and just log once.- Otherwise you:
- Snapshot
start_indexatDIP0003Heightandstop_indexat tip undercs_main.- Run
RecalculateAndRepairDiffswith abuild_list_functhat delegates tochain_helper->special_tx->RebuildListFromBlock, reusing the already-validated special-tx logic.- Log verification errors as warnings, treat any
repair_errorsas fatal (log +StartShutdown()), and on success callCompleteRepair()and log a concise summary with timing viaTicks().This keeps cs_main locking minimal and runs before network connections are started, so there is no concurrent chain mutation during the repair.
Two optional improvements you might consider:
- Handle the “no work to do” case explicitly
If
start_indexis null orstart_index->nHeight >= stop_index->nHeight(e.g. chain height is below DIP0003 activation on a fresh dev/regtest setup), the code silently does nothing and never sets the repaired flag. That’s not incorrect, but it means the check will rerun on every startup.You could treat this as a trivially-successful repair, e.g.:
- if (start_index && stop_index && start_index->nHeight < stop_index->nHeight) { + if (start_index && stop_index && start_index->nHeight < stop_index->nHeight) { // existing verification/repair logic... - } + } else { + LogPrintf("No masternode list diffs to verify in current height range; marking EvoDB as repaired\n"); + node.dmnman->CompleteRepair(); + }
- Defensive check for chain_helper/special_tx
The lambda assumes
node.chain_helperandnode.chain_helper->special_txare always initialized. That’s true in the normal node path afterLoadChainstate, but if any future refactoring ever makes this optional, an assert or explicit check with a clear log message here would make failures easier to diagnose.Overall, the repair logic, error handling, and placement in the startup sequence look correct and low-risk.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
src/evo/deterministicmns.cpp(2 hunks)src/evo/deterministicmns.h(1 hunks)src/init.cpp(3 hunks)
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-01-07T18:50:44.838Z
Learnt from: knst
Repo: dashpay/dash PR: 6511
File: src/evo/deterministicmns.cpp:1369-1373
Timestamp: 2025-01-07T18:50:44.838Z
Learning: The functions `MigrateDBIfNeeded` and `MigrateDBIfNeeded2` in `src/evo/deterministicmns.cpp` are temporary and will be removed in a future version. Refactoring suggestions for these functions should be avoided.
Applied to files:
src/evo/deterministicmns.hsrc/evo/deterministicmns.cpp
📚 Learning: 2025-11-13T20:02:55.480Z
Learnt from: UdjinM6
Repo: dashpay/dash PR: 6969
File: src/evo/deterministicmns.h:441-479
Timestamp: 2025-11-13T20:02:55.480Z
Learning: In `src/evo/deterministicmns.h`, the `internalId` field in `CDeterministicMN` and the `mnInternalIdMap` in `CDeterministicMNList` are non-deterministic and used only for internal bookkeeping and efficient lookups. Different nodes can assign different internalIds to the same masternode depending on their sync history. Methods like `IsEqual()` intentionally ignore internalId mappings and only compare consensus-critical deterministic fields (proTxHash, collateral, state, etc.).
Applied to files:
src/evo/deterministicmns.hsrc/evo/deterministicmns.cpp
📚 Learning: 2025-10-28T18:36:40.263Z
Learnt from: kwvg
Repo: dashpay/dash PR: 6923
File: src/test/util/setup_common.cpp:235-251
Timestamp: 2025-10-28T18:36:40.263Z
Learning: In `src/test/util/setup_common.cpp`, the `CEvoDB` instance in `BasicTestingSetup` is constructed with `.memory = true` flag (memory-only mode), so it does not create file handles on disk. This makes the destructor teardown order safe even if `fs::remove_all(m_path_root)` is called before `m_node.evodb.reset()`.
Applied to files:
src/evo/deterministicmns.cpp
🧬 Code graph analysis (2)
src/evo/deterministicmns.h (1)
src/evo/deterministicmns.cpp (4)
IsRepaired(1694-1694)IsRepaired(1694-1694)CompleteRepair(1696-1706)CompleteRepair(1696-1696)
src/init.cpp (2)
src/node/interfaces.cpp (1)
fReindex(623-623)src/util/time.h (1)
Ticks(81-84)
🔇 Additional comments (3)
src/evo/deterministicmns.cpp (1)
32-36: DB_LIST_REPAIRED key is consistent with existing Evo DB keyingThe new
"dmn_R1"key forDB_LIST_REPAIREDfollows the existing naming/versioning pattern (dmn_S*,dmn_D*) and is scoped to this module; no issues here.src/init.cpp (2)
79-89: New Evo includes are scoped and appropriateAdding
evo/chainhelper.handevo/specialtxman.hhere matches the new startup repair usage ofnode.chain_helper->special_txand doesn’t broaden dependencies beyond init-time wiring.
756-756: -forceevodbrepair argument wiring is correctThe new
-forceevodbrepairflag is registered consistently with other debug/test options and its help text clearly indicates behavior (“Force evodb masternode list diff verification and repair on startup, even if already repaired”).No issues with the definition; it lines up with the logic later in
AppInitMain.
PastaPastaPasta
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
utACK 2e5fdd8
|
Guix Automation has began to build this PR tagged as v23.0.0-devpr6999.2e5fdd83. A new comment will be made when the image is pushed. |
|
Guix Automation has completed; a release should be present here: https://github.com/dashpay/dash-dev-branches/releases/tag/v23.0.0-devpr6999.2e5fdd83. The image should be on dockerhub soon. |
kwvg
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
utACK 2e5fdd8
nit suggestions to be considered only if other reviews request changes, else disregard
|
|
||
| bool CDeterministicMNManager::IsRepaired() const { return m_evoDb.Exists(DB_LIST_REPAIRED); } | ||
|
|
||
| void CDeterministicMNManager::CompleteRepair() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: names too reassuring, consider
IsRepaired()->GetEvoDbRepairFlag()CompleteRepair()->SetEvoDbAsRepaired()
| const auto start{SteadyClock::now()}; | ||
| // Create a callback that wraps CSpecialTxProcessor::BuildNewListFromBlock | ||
| auto build_list_func = [&node](const CBlock& block, gsl::not_null<const CBlockIndex*> pindexPrev, | ||
| const CDeterministicMNList& prevList, const CCoinsViewCache& view, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: tab alignment
… at node startup 2e5fdd8 feat: verify and repair evodb diffs automatically at node startup (UdjinM6) Pull request description: ## Issue being fixed or feature implemented Automatically verify and repair deterministic masternode list diffs in evodb during node startup. Helps detect and fix database corruption without manual intervention. ## What was done? - Add `DB_LIST_REPAIRED` marker to track when repair is complete - Skip repair during reindex (fresh rebuild) - Add `-forceevodbrepair` flag to force re-verification - Shutdown gracefully on critical errors with user instructions - Run in background thread with minimal `cs_main` locking ## How Has This Been Tested? Run a node, check logs ## Breaking Changes n/a ## Checklist: - [ ] I have performed a self-review of my own code - [ ] I have commented my code, particularly in hard-to-understand areas - [ ] I have added or updated relevant unit/integration/functional/e2e tests - [ ] I have made corresponding changes to the documentation - [ ] I have assigned this pull request to a milestone _(for repository code-owners and collaborators only)_ ACKs for top commit: PastaPastaPasta: utACK 2e5fdd8 kwvg: utACK 2e5fdd8 --- nit suggestions to be considered only if other reviews request changes, else disregard Tree-SHA512: 6f78e90057bd3e2f8cc7948ce8c6cd14ac0f33395b2c552d89f2abda50159ebfe1f957a2dcf59bc712a495bf12a7f796ab9f3902f90035add70475bffe37def2
736bb26 chore: bump manpages for 23.0.1 (pasta) d5c7d25 chore: bump nMinimumChainWork and defaultAssumeValid (pasta) 4f8aa71 chore: bump version to 23.0.1 (pasta) 0865b7c docs: add release notes for 23.0.1 (pasta) 2048b42 Merge #6986: test: new commandline argument -tinyblk to use blk size just 64kb instead 16Mb (pasta) 1a9b20c Merge #7013: fix: update BuildTestVectors call to adjust batch size based on output flag (pasta) 36e4679 Merge #7009: fix: include QDebug directly (pasta) 69d0c9c Merge #6999: feat: verify and repair evodb diffs automatically at node startup (pasta) ca16437 Merge #6996: perf: reduce cs_main lock scope in evodb verify/repair operations (pasta) 207526e Merge #6977: fix: bls benchmarks crash when ran independently (pasta) 226aaf4 Merge #6969: feat: add evodb verify and repair RPC commands (pasta) 92abe9b Merge #6964: perf: remove duplicated check of same key in the instant send database (pasta) 5a1ec4c Merge #6961: fix: correct BLS scheme setting in `MigrateLegacyDiffs()` when `nVersion` is present (pasta) bf653d3 Merge #6949: depends: Qt 5.15.18 (pasta) faf58cd merge bitcoin#30774: Qt 5.15.16 (Kittywhiskers Van Gogh) 6a995f5 Merge #6944: fix: HD chain encryption check ordering issue (pasta) Pull request description: ## Issue being fixed or feature implemented See commits; release v23.0.1 ## What was done? see commits ## How Has This Been Tested? ## Breaking Changes ## Checklist: _Go over all the following points, and put an `x` in all the boxes that apply._ - [ ] I have performed a self-review of my own code - [ ] I have commented my code, particularly in hard-to-understand areas - [ ] I have added or updated relevant unit/integration/functional/e2e tests - [ ] I have made corresponding changes to the documentation - [x] I have assigned this pull request to a milestone _(for repository code-owners and collaborators only)_ ACKs for top commit: kwvg: utACK 736bb26 Tree-SHA512: fe02c4c3e520b11af54d09c47bda94112313456f9f1cb6d14e78ef16704b2a8ec8feb80fa914c55e152db2bcac7f278291824aaa36c8079eb9e0c9bff9e554a4
fe1cff3 chore: bump release to 23.0.2 (pasta) a8f15c1 Merge #7032: fix: drop gsl usage from RebuildListFromBlock function wrapper (pasta) 736bb26 chore: bump manpages for 23.0.1 (pasta) d5c7d25 chore: bump nMinimumChainWork and defaultAssumeValid (pasta) 4f8aa71 chore: bump version to 23.0.1 (pasta) 0865b7c docs: add release notes for 23.0.1 (pasta) 2048b42 Merge #6986: test: new commandline argument -tinyblk to use blk size just 64kb instead 16Mb (pasta) 1a9b20c Merge #7013: fix: update BuildTestVectors call to adjust batch size based on output flag (pasta) 36e4679 Merge #7009: fix: include QDebug directly (pasta) 69d0c9c Merge #6999: feat: verify and repair evodb diffs automatically at node startup (pasta) ca16437 Merge #6996: perf: reduce cs_main lock scope in evodb verify/repair operations (pasta) 207526e Merge #6977: fix: bls benchmarks crash when ran independently (pasta) 226aaf4 Merge #6969: feat: add evodb verify and repair RPC commands (pasta) 92abe9b Merge #6964: perf: remove duplicated check of same key in the instant send database (pasta) 5a1ec4c Merge #6961: fix: correct BLS scheme setting in `MigrateLegacyDiffs()` when `nVersion` is present (pasta) bf653d3 Merge #6949: depends: Qt 5.15.18 (pasta) faf58cd merge bitcoin#30774: Qt 5.15.16 (Kittywhiskers Van Gogh) 6a995f5 Merge #6944: fix: HD chain encryption check ordering issue (pasta) 6fd7059 chore: mark v23 as release (pasta) ae08f53 docs: integrate 6946 release notes into final (pasta) 74a222d Merge #6946: feat: show seed on wallet creation (pasta) 877343a Merge #6943: fix: don't treat arrays/objects as string literals for composite methods (pasta) 00368fb Merge #6940: fix: reuse best clsig to avoid potential race condition (pasta) 8eceb98 Merge #6938: fix: logic error in `CheckDecryptionKey` (pasta) 3f30664 Merge #6929: fix: repair `makeseeds.py`, `getblockchaininfo[softforks]` help text, drop extra `generate`s from test, resolve macOS GID issue (pasta) 7ba4f1c Merge #6928: docs: update man pages for 23.0 (pasta) a6c1d6a Merge #6920: chore: update minimum chain work, tx stats, checkpoints, seeds and SECURITY.md for v23 (pasta) 84df1f0 Merge #6909: chore: Translations 2025-10 (pasta) a6449b1 Merge #6885: fix: improve governance/proposal dialog strings (pasta) ebf3a64 docs: typos (pasta) 4ad5533 docs: typos (pasta) f407453 doc: Replace Bitcoin Core PR references with Dash Core backport PRs (pasta) 78d0725 docs: add note on proto bump and platformban p2p (pasta) e0519c3 docs: fix whitespace errors (pasta) bc8db22 docs: minor improvements to release notes (pasta) c338511 docs: reorganize rpc updates to organize extended address changes (thephez) 700c46e style: make heading style consistent (thephez) bd636bd docs: add contributors (pasta) 6d29bc3 docs: revert changes deferred to v24 (pasta) 615f5ff docs: make the downgrade warning more confident (pasta) 567771a Apply suggestions from code review (PastaPastaPasta) 2b3211a docs: add link to 22.1.3 release notes (pasta) 548a38a docs: remove individual release notes files (pasta) e770c25 docs: add v23.0.0 release notes and archive v22.1.3 (pasta) Pull request description: ## Issue being fixed or feature implemented Includes changes from 23.0.0 release too because we never merged it back. ## What was done? ## How Has This Been Tested? ## Breaking Changes ## Checklist: - [ ] I have performed a self-review of my own code - [ ] I have commented my code, particularly in hard-to-understand areas - [ ] I have added or updated relevant unit/integration/functional/e2e tests - [ ] I have made corresponding changes to the documentation - [ ] I have assigned this pull request to a milestone _(for repository code-owners and collaborators only)_ ACKs for top commit: PastaPastaPasta: utACK 491db4a kwvg: utACK 491db4a Tree-SHA512: 61895cd1f9d01ac7be1d407afe1ddd24b98e8242cb03229ecd586a4d2d1c43dbc62c98da52a8c715b3a5943fb40e99b23251e691f778779af9d6da94392122a3
Issue being fixed or feature implemented
Automatically verify and repair deterministic masternode list diffs in evodb during node startup. Helps detect and fix database corruption without manual intervention.
What was done?
DB_LIST_REPAIREDmarker to track when repair is complete-forceevodbrepairflag to force re-verificationcs_mainlockingHow Has This Been Tested?
Run a node, check logs
Breaking Changes
n/a
Checklist: