-
Notifications
You must be signed in to change notification settings - Fork 1.2k
feat: instantsend cache heights and reduce cs_main contention #6953
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
feat: instantsend cache heights and reduce cs_main contention #6953
Conversation
- Add unordered_lru_cache for block heights in CInstantSendManager - Provide GetTipHeight() (non-optional) and GetBlockHeight() helpers - Use cache in InstantSendSigner::CheckCanLock - Cache-first for cycleHash height in ISLOCK message path and batch verify - Cache-first mined-height for HasChainLock and WriteInstantSendLockMined Only InstantSend files are included in this commit.
|
|
Guix Automation has began to build this PR tagged as v23.0.0-devpr6953.ce5601c6. A new comment will be made when the image is pushed. |
WalkthroughThe patch adds an explicit block-height cache to CInstantSendManager (m_cached_block_heights, m_cached_tip_height) with CacheBlockHeight/CacheBlockHeightInternal/GetBlockHeight/GetTipHeight APIs. ISLOCK processing (ProcessMessage, ProcessPendingInstantSendLocks, ProcessInstantSendLock, ResolveBlockConflicts, RemoveConflictingLock) now consults and updates the height cache instead of direct CBlockIndex lookups; mined heights are persisted when resolved. BlockConnected, BlockDisconnected, and UpdatedBlockTip synchronize and maintain the cache and cached tip height. Locking annotations were updated to require cs_height_cache where appropriate. Sequence Diagram(s)sequenceDiagram
participant P2P as P2P Message
participant ISM as InstantSendManager
participant Cache as HeightCache (m_cached_block_heights)
participant Chain as ChainState
participant Mempool as Mempool/UTXO
P2P->>ISM: Receive ISLOCK (ProcessMessage)
ISM->>Cache: GetBlockHeight(cycleHash)
alt cached
Cache-->>ISM: return height
else not cached
ISM->>Chain: LookupBlockIndex(cycleHash)
Chain-->>ISM: return blockIndex/height?
ISM->>Cache: CacheBlockHeight(cycleHash,height)
end
ISM->>ISM: validate cycleHeight vs dkgInterval
ISM->>ISM: ProcessInstantSendLock(...)
ISM->>Mempool: Find TX / check inputs
alt TX mined
ISM->>Cache: GetBlockHeight(hashBlock)
alt cached
Cache-->>ISM: minedHeight
else not cached
ISM->>Chain: LookupBlockIndex(hashBlock)
Chain-->>ISM: return height
ISM->>Cache: CacheBlockHeight(hashBlock,height)
end
ISM->>ISM: WriteInstantSendLockMined(minedHeight)
else TX unmined
ISM->>ISM: handle mempool path / mark pending
end
sequenceDiagram
participant Node as Node
participant ISM as InstantSendManager
participant Cache as HeightCache
Node->>ISM: BlockConnected(newBlock)
ISM->>Cache: CacheBlockHeight(newBlock.hash, newBlock.height)
ISM->>Cache: Update m_cached_tip_height
Node->>ISM: BlockDisconnected(oldBlock)
ISM->>Cache: Invalidate/Adjust cached entries for oldBlock and recompute new tip
ISM->>Cache: CacheBlockHeight(newTip.hash,newTip.height) (if known)
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Areas requiring extra attention:
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
Guix Automation has completed; a release should be present here: https://github.com/dashpay/dash-dev-branches/releases/tag/v23.0.0-devpr6953.ce5601c6. The image should be on dockerhub soon. |
|
Guix Automation has began to build this PR tagged as v23.0.0-devpr6953.8fc12bc1. A new comment will be made when the image is pushed. |
|
Guix Automation has completed; a release should be present here: https://github.com/dashpay/dash-dev-branches/releases/tag/v23.0.0-devpr6953.8fc12bc1. The image should be on dockerhub soon. |
|
Guix Automation has began to build this PR tagged as v23.0.0-devpr6953.8a5b845a. A new comment will be made when the image is pushed. |
|
Guix Automation has completed; a release should be present here: https://github.com/dashpay/dash-dev-branches/releases/tag/v23.0.0-devpr6953.8a5b845a. The image should be on dockerhub soon. |
- Updated log messages in CInstantSendManager to enhance clarity and maintain consistency. - Continued log message formatting for better readability. - No functional changes were made to the logic.
|
Guix Automation has began to build this PR tagged as v23.0.0-devpr6953.8bdb71d4. A new comment will be made when the image is pushed. |
|
Guix Automation has completed; a release should be present here: https://github.com/dashpay/dash-dev-branches/releases/tag/v23.0.0-devpr6953.8bdb71d4. The image should be on dockerhub soon. |
| Uint256HashMap<int64_t> timingsTxSeen GUARDED_BY(cs_timingsTxSeen); | ||
|
|
||
| mutable Mutex cs_height_cache; | ||
| static constexpr size_t MAX_BLOCK_HEIGHT_CACHE{16384}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isn't 16k a bit too much? rotation quorums are not alive that much; also it's very unlikely situation that tx will exist that long before it will be mined.
16k is roughly 28 days
|
Deployed analysis during load on testnet. Note; that these 2 nodes can't be compared 1:1, they may be slightly differently connected, but we see significant improvements. 🔍 Overall Contention Summary
Interpretation: ⚙️ Top Locks by Total Contention Time1. Major Improvements
2. New or Increased Contention
📈 Top Locks by Contention Count
Observations:
🔄 Structural & Behavioral Shifts✅ Improved
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (1)
src/instantsend/instantsend.h (1)
97-97: Reconsider the cache size constant.The MAX_BLOCK_HEIGHT_CACHE constant is set to 16384, which represents approximately 28 days of blocks (at ~2.5 minute block time). As noted in previous review feedback, this may be excessive for the typical lifetime of transactions in the InstantSend flow. Consider reducing this to a more reasonable value based on typical InstantSend lock processing times and memory constraints.
🧹 Nitpick comments (2)
src/instantsend/instantsend.cpp (2)
131-141: Consider minor optimization in cache update.When the block index is found via fallback lookup (line 133-138), you're calling
blockIndex->GetBlockHash()again on line 139. Sinceislock->cycleHashshould already match the block hash, you could useislock->cycleHashdirectly for caching.Apply this diff:
- CacheBlockHeight(blockIndex->GetBlockHash(), blockIndex->nHeight); + CacheBlockHeight(islock->cycleHash, blockIndex->nHeight);
391-399: Simplify fallback logic by relying on GetBlockHeight.The fallback pattern here (lines 393-399) duplicates logic already present in
GetBlockHeight(lines 977-995). SinceGetBlockHeightalready performs the lookup and caching internally, the explicit fallback here may be redundant. The current code callsGetBlockHeightonce (line 391), then does manual lookup and caching if it returns nullopt (lines 393-399).However,
GetBlockHeightalready does this internally. The only scenario where this explicit fallback helps is if you need to ensure caching happens even when the block isn't initially found. ButGetBlockHeightalready caches on successful lookup.Consider whether this duplication is necessary or if a single
GetBlockHeightcall suffices.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
src/instantsend/instantsend.cpp(9 hunks)src/instantsend/instantsend.h(6 hunks)src/instantsend/signing.cpp(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- src/instantsend/signing.cpp
🧰 Additional context used
🧠 Learnings (7)
📓 Common learnings
Learnt from: kwvg
Repo: dashpay/dash PR: 6543
File: src/wallet/receive.cpp:240-251
Timestamp: 2025-02-06T14:34:30.466Z
Learning: Pull request #6543 is focused on move-only changes and refactoring, specifically backporting from Bitcoin. Behavior changes should be proposed in separate PRs.
Learnt from: knst
Repo: dashpay/dash PR: 6692
File: src/llmq/blockprocessor.cpp:217-224
Timestamp: 2025-08-19T14:57:31.801Z
Learning: In PR #6692, knst acknowledged a null pointer dereference issue in ProcessBlock() method where LookupBlockIndex may return nullptr but is passed to gsl::not_null, and created follow-up PR #6789 to address it, consistent with avoiding scope creep in performance-focused PRs.
📚 Learning: 2025-08-19T14:57:31.801Z
Learnt from: knst
Repo: dashpay/dash PR: 6692
File: src/llmq/blockprocessor.cpp:217-224
Timestamp: 2025-08-19T14:57:31.801Z
Learning: In PR #6692, knst acknowledged a null pointer dereference issue in ProcessBlock() method where LookupBlockIndex may return nullptr but is passed to gsl::not_null, and created follow-up PR #6789 to address it, consistent with avoiding scope creep in performance-focused PRs.
Applied to files:
src/instantsend/instantsend.cpp
📚 Learning: 2025-07-15T14:53:04.819Z
Learnt from: knst
Repo: dashpay/dash PR: 6691
File: src/test/llmq_params_tests.cpp:148-151
Timestamp: 2025-07-15T14:53:04.819Z
Learning: In the Dash Core LLMQ implementation, signingActiveQuorumCount is never 0 in the actual parameters defined in params.h, making division by zero scenarios unrealistic in the max_cycles() function.
Applied to files:
src/instantsend/instantsend.cpp
📚 Learning: 2025-05-05T12:45:44.781Z
Learnt from: knst
Repo: dashpay/dash PR: 6658
File: src/evo/creditpool.cpp:177-185
Timestamp: 2025-05-05T12:45:44.781Z
Learning: The GetAncestor() function in CBlockIndex safely handles negative heights by returning nullptr rather than asserting, making it safe to call with potentially negative values.
Applied to files:
src/instantsend/instantsend.cpp
📚 Learning: 2025-10-21T11:09:34.688Z
Learnt from: kwvg
Repo: dashpay/dash PR: 6849
File: src/governance/governance.cpp:1339-1343
Timestamp: 2025-10-21T11:09:34.688Z
Learning: In the Dash Core codebase, `CacheMap` (defined in src/cachemap.h) is internally thread-safe and uses its own `mutable CCriticalSection cs` to protect access to its members. Methods like `GetSize()`, `Insert()`, `Get()`, `HasKey()`, etc., can be called without holding external locks.
Applied to files:
src/instantsend/instantsend.h
📚 Learning: 2025-01-02T21:50:00.967Z
Learnt from: kwvg
Repo: dashpay/dash PR: 6504
File: src/llmq/context.cpp:42-43
Timestamp: 2025-01-02T21:50:00.967Z
Learning: LLMQContext manages concurrency for the `CInstantSendManager`. Previously, this was handled globally; now it's handled as a class member in `LLMQContext`, but the concurrency control remains consistent.
Applied to files:
src/instantsend/instantsend.h
📚 Learning: 2025-09-09T21:36:11.833Z
Learnt from: kwvg
Repo: dashpay/dash PR: 6837
File: src/stats/rawsender.cpp:0-0
Timestamp: 2025-09-09T21:36:11.833Z
Learning: In RawSender class (src/stats/rawsender.cpp), cs_net is the appropriate mutex for protecting socket access (m_sock) and network operations, not additional custom locks. The implementation correctly uses cs_net with GUARDED_BY annotations and EXCLUSIVE_LOCKS_REQUIRED to synchronize socket access between SendDirectly() and ReconnectThread().
Applied to files:
src/instantsend/instantsend.h
🧬 Code graph analysis (2)
src/instantsend/instantsend.cpp (1)
src/node/interfaces.cpp (23)
hash(893-910)hash(893-893)hash(932-936)hash(932-932)hash(958-963)hash(958-958)LOCK(581-585)LOCK(591-598)LOCK(599-606)LOCK(868-877)LOCK(911-915)LOCK(1095-1099)height(563-573)height(563-563)height(878-885)height(878-878)height(886-892)height(886-886)height(937-941)height(937-937)tip(97-105)tip(586-590)tip(607-615)
src/instantsend/instantsend.h (2)
src/saltedhasher.h (1)
uint256(38-44)src/instantsend/instantsend.cpp (14)
CacheBlockHeightInternal(965-969)CacheBlockHeightInternal(965-965)ProcessPendingInstantSendLocks(183-246)ProcessPendingInstantSendLocks(183-183)ProcessPendingInstantSendLocks(248-360)ProcessPendingInstantSendLocks(248-251)ProcessInstantSendLock(362-449)ProcessInstantSendLock(362-363)CacheBlockHeight(971-975)CacheBlockHeight(971-971)GetBlockHeight(977-995)GetBlockHeight(977-977)GetTipHeight(997-1015)GetTipHeight(997-997)
🪛 GitHub Actions: Clang Diff Format Check
src/instantsend/instantsend.cpp
[error] 393-395: Clang format differences found. Apply clang-format changes to src/instantsend/instantsend.cpp. Exit code 1 due to unformatted code.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
- GitHub Check: linux64_tsan-build / Build source
- GitHub Check: arm-linux-build / Build source
- GitHub Check: win64-build / Build source
- GitHub Check: mac-build / Build source
- GitHub Check: linux64_ubsan-build / Build source
- GitHub Check: linux64_fuzz-build / Build source
- GitHub Check: linux64-build / Build source
- GitHub Check: linux64_sqlite-build / Build source
- GitHub Check: linux64_nowallet-build / Build source
- GitHub Check: Lint / Run linters
🔇 Additional comments (10)
src/instantsend/instantsend.cpp (9)
25-25: LGTM!Adding the cassert include is appropriate for the assert statements used in the implementation.
280-292: LGTM!The height cache integration correctly uses
GetBlockHeightto fetch the cycle height andGetTipHeightto determine the appropriate signing height for quorum selection. The thread-safe cache access reduces cs_main contention as intended.
509-513: LGTM!The block height caching during connection correctly maintains the cache and updates the tip height atomically under the lock. This ensures the cache stays synchronized with chain state.
542-550: LGTM with minor observation.The cache invalidation logic correctly removes the disconnected block and updates the tip. The code handles the case where there's no previous block (genesis disconnect) by setting
m_cached_tip_heightto -1.One minor note: when caching
new_tip(line 548), we assumepprevis a valid block. This is safe becausepprevis either a valid ancestor or nullptr, and we guard the caching withif (new_tip).
673-677: LGTM!The tip height caching in
UpdatedBlockTipcorrectly maintains synchronization between the chain state and the height cache.
859-859: LGTM!Using
GetTipHeighthere reduces cs_main contention as intended by the PR objectives.
965-975: LGTM!The internal/external caching method pattern correctly separates lock acquisition from the locked operation, allowing efficient batch caching when the lock is already held.
977-995: LGTM!The
GetBlockHeightimplementation correctly:
- Returns
std::nulloptfor null/unknown hashes- Checks the cache first to avoid cs_main contention
- Falls back to block index lookup when necessary
- Caches the result before returning
This achieves the PR's goal of reducing cs_main contention in the InstantSend flow.
997-1015: Document assumption in GetTipHeight assertion.The assertion at line 1007 (
assert(tip != nullptr)) assumes the chain tip is always valid. While this is typically true after genesis, consider whether this is always guaranteed in all call contexts (e.g., during initialization or chain reorgs).If this assumption is architectural (i.e., GetTipHeight should never be called before the chain is initialized), the assert is appropriate but could benefit from a comment explaining the invariant. Otherwise, consider returning a sentinel value or throwing a more descriptive error.
Based on learnings about safe handling of potential nullptr returns from chain operations.
src/instantsend/instantsend.h (1)
98-99: StaticSaltedHasher is properly compatible with uint256.The
SaltedHasherImpl<uint256>specialization exists and usesSipHashUint256(k0, k1, v)for hashing uint256 values. StaticSaltedHasher's templatedoperator()delegates toSaltedHasherImpl<T>::CalcHash, which for uint256 keys provides proper cryptographic distribution. This pattern is used consistently throughout the codebase for LRU caches with uint256 keys (e.g., masternode/meta.h, llmq/signing.h), and there's even a dedicatedUint256LruHashMaptype alias for this use case.
| std::optional<int> minedHeight = GetBlockHeight(hashBlock); | ||
| if (found_transaction) { | ||
| if (!minedHeight.has_value()) { | ||
| const CBlockIndex* pindexMined = WITH_LOCK(::cs_main, return m_chainstate.m_blockman.LookupBlockIndex(hashBlock)); | ||
| if (pindexMined != nullptr) { | ||
| CacheBlockHeight(pindexMined->GetBlockHash(), pindexMined->nHeight); | ||
| minedHeight = pindexMined->nHeight; | ||
| } | ||
| } | ||
| // Let's see if the TX that was locked by this islock is already mined in a ChainLocked block. If yes, | ||
| // we can simply ignore the islock, as the ChainLock implies locking of all TXs in that chain | ||
| if (pindexMined != nullptr && clhandler.HasChainLock(pindexMined->nHeight, pindexMined->GetBlockHash())) { | ||
| LogPrint(BCLog::INSTANTSEND, "CInstantSendManager::%s -- txlock=%s, islock=%s: dropping islock as it already got a ChainLock in block %s, peer=%d\n", __func__, | ||
| islock->txid.ToString(), hash.ToString(), hashBlock.ToString(), from); | ||
| if (minedHeight.has_value() && clhandler.HasChainLock(*minedHeight, hashBlock)) { | ||
| LogPrint(BCLog::INSTANTSEND, /* Continued */ | ||
| "CInstantSendManager::%s -- txlock=%s, islock=%s: dropping islock as it already got a " /* Continued */ | ||
| "ChainLock in block %s, peer=%d\n", | ||
| __func__, islock->txid.ToString(), hash.ToString(), hashBlock.ToString(), from); | ||
| return {}; | ||
| } | ||
| } | ||
|
|
||
| if (found_transaction) { | ||
| db.WriteNewInstantSendLock(hash, islock); | ||
| if (pindexMined) { | ||
| db.WriteInstantSendLockMined(hash, pindexMined->nHeight); | ||
| if (minedHeight.has_value()) { | ||
| db.WriteInstantSendLockMined(hash, *minedHeight); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix clang-format issue.
The pipeline indicates clang-format differences at lines 393-395. Please run clang-format to resolve the formatting issue.
🧰 Tools
🪛 GitHub Actions: Clang Diff Format Check
[error] 393-395: Clang format differences found. Apply clang-format changes to src/instantsend/instantsend.cpp. Exit code 1 due to unformatted code.
🤖 Prompt for AI Agents
In src/instantsend/instantsend.cpp around lines 391 to 415, the submitted
changes trigger a clang-format mismatch specifically around lines 393-395;
reformat the file (or at least that region) with the project's clang-format
configuration to fix spacing/line breaks so the code matches style rules, then
stage and commit the formatted file. Ensure no semantic changes are
introduced—only whitespace/formatting adjustments.
|
Guix Automation has began to build this PR tagged as v23.0.0-devpr6953.13727fb3. A new comment will be made when the image is pushed. |
|
Guix Automation has completed; a release should be present here: https://github.com/dashpay/dash-dev-branches/releases/tag/v23.0.0-devpr6953.13727fb3. The image should be on dockerhub soon. |
|
pls consider ffd84b6 |
Issue being fixed or feature implemented
Today's testing on testnet showed under load, cs_main was a major contender:
(however, note the major contenders from before #6912 / #6468 are now significantly reduced)
What was done?
How Has This Been Tested?
Breaking Changes
Please describe any breaking changes your code introduces
Checklist:
Go over all the following points, and put an
xin all the boxes that apply.