-
Notifications
You must be signed in to change notification settings - Fork 1.2k
feat: convert CSigSharesManager to use Mutex instead of RecursiveMutex #6954
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: convert CSigSharesManager to use Mutex instead of RecursiveMutex #6954
Conversation
|
|
This pull request has conflicts, please rebase. |
update lock requirements for various methods. This change enhances thread safety and simplifies locking semantics throughout the class.
e9c2fea to
8832775
Compare
WalkthroughThe diff changes CSigSharesManager's internal locking from a RecursiveMutex to a non-recursive Mutex and adds EXCLUSIVE_LOCKS_REQUIRED annotations across many public and private methods to specify required lock state (!cs, cs, !cs_pendingSigns). Several method signatures were updated to reflect these lock requirements. A new private helper ProcessRecoveredSigUnlocked(...) (LOCKS_EXCLUDED(cs)) was added. Single-member recovered-sign processing was changed to defer actual recovery handling: a local shared_ptr is created while holding the lock, the lock is released, then ProcessRecoveredSigUnlocked / sigman.ProcessRecoveredSig is invoked outside the main lock. No public API return types were changed. Sequence Diagram(s)sequenceDiagram
participant Caller
participant Manager as CSigSharesManager
participant cs as "Mutex cs"
Note over Manager: Many methods now require EXCLUSIVE_LOCKS_REQUIRED(!cs) or EXCLUSIVE_LOCKS_REQUIRED(cs)
Caller->>Manager: StartWorkerThread()
Note right of Manager: EXCLUSIVE_LOCKS_REQUIRED(!cs)
Caller->>cs: (must NOT hold)
Caller->>Manager: ProcessMessage(...)
Note right of Manager: EXCLUSIVE_LOCKS_REQUIRED(!cs)
Caller->>cs: (must NOT hold)
sequenceDiagram
participant TryRecover as TryRecoverSig
participant Manager as CSigSharesManager
participant cs as "Mutex cs"
participant SigMan as "CSigningManager (sigman)"
Note over TryRecover,Manager: Single-member recovery deferred to run unlocked
TryRecover->>cs: acquire
TryRecover->>TryRecover: detect single-member quorum
TryRecover->>TryRecover: create singleMemberRecoveredSig (shared_ptr) while locked
TryRecover->>cs: release
alt singleMemberRecoveredSig set
TryRecover->>Manager: ProcessRecoveredSigUnlocked(singleMemberRecoveredSig)
Note right of Manager: LOCKS_EXCLUDED(cs) — processing runs without main lock
Manager->>SigMan: sigman.ProcessRecoveredSig(...)
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
📜 Recent review detailsConfiguration used: CodeRabbit UI Review profile: CHILL Plan: Pro 📒 Files selected for processing (2)
🧰 Additional context used📓 Path-based instructions (1)src/**/*.{cpp,h,cc,cxx,hpp}📄 CodeRabbit inference engine (CLAUDE.md)
Files:
🧠 Learnings (12)📓 Common learnings📚 Learning: 2025-07-29T14:32:48.369ZApplied to files:
📚 Learning: 2025-02-14T15:15:58.165ZApplied to files:
📚 Learning: 2025-07-09T15:05:36.250ZApplied to files:
📚 Learning: 2025-09-09T21:36:11.833ZApplied to files:
📚 Learning: 2025-09-09T21:36:11.833ZApplied to files:
📚 Learning: 2025-09-09T21:36:31.504ZApplied to files:
📚 Learning: 2025-09-09T21:36:31.504ZApplied to files:
📚 Learning: 2025-09-09T21:36:58.969ZApplied to files:
📚 Learning: 2025-09-09T21:36:58.969ZApplied to files:
📚 Learning: 2025-10-02T18:29:54.756ZApplied to files:
📚 Learning: 2025-01-02T21:50:00.967ZApplied to files:
🧬 Code graph analysis (1)src/llmq/signing_shares.h (1)
🪛 GitHub Actions: Clang Diff Format Checksrc/llmq/signing_shares.cpp[error] 803-808: Clang-format differences detected by clang-format-diff.py. Code changes include reformatting of singleMemberRecoveredSig construction. CI failed due to formatting changes not applied. ⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
🔇 Additional comments (4)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
src/llmq/signing_shares.h(2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
src/**/*.{cpp,h,cc,cxx,hpp}
📄 CodeRabbit inference engine (CLAUDE.md)
src/**/*.{cpp,h,cc,cxx,hpp}: Dash Core C++ codebase must be written in C++20 and require at least Clang 16 or GCC 11.1
Dash uses unordered_lru_cache for efficient caching with LRU eviction
Files:
src/llmq/signing_shares.h
🧠 Learnings (10)
📓 Common learnings
Learnt from: kwvg
Repo: dashpay/dash PR: 6543
File: src/wallet/receive.cpp:240-251
Timestamp: 2025-02-06T14:34:30.466Z
Learning: Pull request #6543 is focused on move-only changes and refactoring, specifically backporting from Bitcoin. Behavior changes should be proposed in separate PRs.
Learnt from: kwvg
Repo: dashpay/dash PR: 6761
File: src/chainlock/signing.cpp:247-250
Timestamp: 2025-07-29T14:32:48.369Z
Learning: In PR #6761, kwvg acknowledged a null pointer check issue in ChainLockSigner::Cleanup() method but deferred it to follow-up, consistent with the pattern of avoiding scope creep in refactoring PRs.
Learnt from: kwvg
Repo: dashpay/dash PR: 6837
File: src/stats/rawsender.cpp:0-0
Timestamp: 2025-09-09T21:36:11.833Z
Learning: In RawSender class (src/stats/rawsender.cpp), cs_net is the appropriate mutex for protecting socket access (m_sock) and network operations, not additional custom locks. The implementation correctly uses cs_net with GUARDED_BY annotations and EXCLUSIVE_LOCKS_REQUIRED to synchronize socket access between SendDirectly() and ReconnectThread().
Learnt from: PastaPastaPasta
Repo: dashpay/dash PR: 6738
File: src/evo/smldiff.h:91-94
Timestamp: 2025-07-22T14:38:45.606Z
Learning: In the Dash codebase, EXCLUSIVE_LOCKS_REQUIRED annotations are Clang thread safety annotations that are enforced at compile time on supported platforms like macOS. If callers don't properly hold the required locks, the build will fail with compile-time errors, not runtime issues.
Learnt from: kwvg
Repo: dashpay/dash PR: 6849
File: src/governance/governance.cpp:1339-1343
Timestamp: 2025-10-21T11:09:34.688Z
Learning: In the Dash Core codebase, `CacheMap` (defined in src/cachemap.h) is internally thread-safe and uses its own `mutable CCriticalSection cs` to protect access to its members. Methods like `GetSize()`, `Insert()`, `Get()`, `HasKey()`, etc., can be called without holding external locks.
Learnt from: kwvg
Repo: dashpay/dash PR: 6837
File: src/stats/rawsender.cpp:223-232
Timestamp: 2025-09-09T21:36:58.969Z
Learning: In RawSender class (src/stats/rawsender.cpp), socket access operations (m_sock and m_server) should be protected by the cs_net mutex, not avoiding synchronization entirely. While lock overhead concerns are valid in general, socket operations require proper synchronization via cs_net.
Learnt from: kwvg
Repo: dashpay/dash PR: 6837
File: src/stats/rawsender.cpp:0-0
Timestamp: 2025-09-09T21:36:31.504Z
Learning: In RawSender class (src/stats/rawsender.cpp), m_sock access requires synchronization using the cs_net mutex. The race condition exists between SendDirectly() (callable from any thread) and Connect() (called by ReconnectThread), where both read and write m_sock concurrently. The cs_net mutex properly coordinates these network operations.
Learnt from: kwvg
Repo: dashpay/dash PR: 6837
File: src/stats/rawsender.cpp:223-232
Timestamp: 2025-09-09T21:36:58.969Z
Learning: In RawSender class (src/stats/rawsender.cpp), socket operations are properly synchronized using the cs_net mutex. The m_sock and m_server members are GUARDED_BY(cs_net), and methods like Connect(), SendDirectly(), and ReconnectThread() use appropriate locking with cs_net to prevent race conditions on socket access.
📚 Learning: 2025-09-09T21:36:11.833Z
Learnt from: kwvg
Repo: dashpay/dash PR: 6837
File: src/stats/rawsender.cpp:0-0
Timestamp: 2025-09-09T21:36:11.833Z
Learning: In RawSender class (src/stats/rawsender.cpp), cs_net is the appropriate mutex for protecting socket access (m_sock) and network operations, not additional custom locks. The implementation correctly uses cs_net with GUARDED_BY annotations and EXCLUSIVE_LOCKS_REQUIRED to synchronize socket access between SendDirectly() and ReconnectThread().
Applied to files:
src/llmq/signing_shares.h
📚 Learning: 2025-07-29T14:32:48.369Z
Learnt from: kwvg
Repo: dashpay/dash PR: 6761
File: src/chainlock/signing.cpp:247-250
Timestamp: 2025-07-29T14:32:48.369Z
Learning: In PR #6761, kwvg acknowledged a null pointer check issue in ChainLockSigner::Cleanup() method but deferred it to follow-up, consistent with the pattern of avoiding scope creep in refactoring PRs.
Applied to files:
src/llmq/signing_shares.h
📚 Learning: 2025-09-09T21:36:58.969Z
Learnt from: kwvg
Repo: dashpay/dash PR: 6837
File: src/stats/rawsender.cpp:223-232
Timestamp: 2025-09-09T21:36:58.969Z
Learning: In RawSender class (src/stats/rawsender.cpp), socket access operations (m_sock and m_server) should be protected by the cs_net mutex, not avoiding synchronization entirely. While lock overhead concerns are valid in general, socket operations require proper synchronization via cs_net.
Applied to files:
src/llmq/signing_shares.h
📚 Learning: 2025-09-09T21:36:11.833Z
Learnt from: kwvg
Repo: dashpay/dash PR: 6837
File: src/stats/rawsender.cpp:0-0
Timestamp: 2025-09-09T21:36:11.833Z
Learning: In RawSender class (src/stats/rawsender.cpp), cs_net is the appropriate mutex for protecting socket access (m_sock) and network operations, not additional custom locks.
Applied to files:
src/llmq/signing_shares.h
📚 Learning: 2025-09-09T21:36:31.504Z
Learnt from: kwvg
Repo: dashpay/dash PR: 6837
File: src/stats/rawsender.cpp:0-0
Timestamp: 2025-09-09T21:36:31.504Z
Learning: In RawSender class (src/stats/rawsender.cpp), m_sock access is protected by the cs_net mutex. This mutex coordinates between the TCP send path in SendDirectly() and the reconnection operations in Connect()/Reconnect() methods, ensuring proper synchronization of socket state.
Applied to files:
src/llmq/signing_shares.h
📚 Learning: 2025-09-09T21:36:31.504Z
Learnt from: kwvg
Repo: dashpay/dash PR: 6837
File: src/stats/rawsender.cpp:0-0
Timestamp: 2025-09-09T21:36:31.504Z
Learning: In RawSender class (src/stats/rawsender.cpp), m_sock access requires synchronization using the cs_net mutex. The race condition exists between SendDirectly() (callable from any thread) and Connect() (called by ReconnectThread), where both read and write m_sock concurrently. The cs_net mutex properly coordinates these network operations.
Applied to files:
src/llmq/signing_shares.h
📚 Learning: 2025-09-09T21:36:58.969Z
Learnt from: kwvg
Repo: dashpay/dash PR: 6837
File: src/stats/rawsender.cpp:223-232
Timestamp: 2025-09-09T21:36:58.969Z
Learning: In RawSender class (src/stats/rawsender.cpp), socket operations are properly synchronized using the cs_net mutex. The m_sock and m_server members are GUARDED_BY(cs_net), and methods like Connect(), SendDirectly(), and ReconnectThread() use appropriate locking with cs_net to prevent race conditions on socket access.
Applied to files:
src/llmq/signing_shares.h
📚 Learning: 2025-10-02T18:29:54.756Z
Learnt from: kwvg
Repo: dashpay/dash PR: 6840
File: src/net_processing.cpp:2882-2886
Timestamp: 2025-10-02T18:29:54.756Z
Learning: Across net_processing.cpp, once LLMQContext (m_llmq_ctx) is asserted non-null, its subcomponents (e.g., isman, qdkgsman, quorum_block_processor) are treated as initialized and used without extra null checks.
Applied to files:
src/llmq/signing_shares.h
📚 Learning: 2025-01-02T21:50:00.967Z
Learnt from: kwvg
Repo: dashpay/dash PR: 6504
File: src/llmq/context.cpp:42-43
Timestamp: 2025-01-02T21:50:00.967Z
Learning: LLMQContext manages concurrency for the `CInstantSendManager`. Previously, this was handled globally; now it's handled as a class member in `LLMQContext`, but the concurrency control remains consistent.
Applied to files:
src/llmq/signing_shares.h
🧬 Code graph analysis (1)
src/llmq/signing_shares.h (2)
src/llmq/signing.cpp (12)
StartWorkerThread(710-718)StartWorkerThread(710-710)StopWorkerThread(720-730)StopWorkerThread(720-720)InterruptWorkerThread(732-735)InterruptWorkerThread(732-732)ProcessMessage(396-436)ProcessMessage(396-396)Cleanup(638-651)Cleanup(638-638)WorkThreadMain(737-749)WorkThreadMain(737-737)src/llmq/signing_shares.cpp (78)
StartWorkerThread(199-207)StartWorkerThread(199-199)StopWorkerThread(209-219)StopWorkerThread(209-209)RegisterAsRecoveredSigsListener(221-224)RegisterAsRecoveredSigsListener(221-221)UnregisterAsRecoveredSigsListener(226-229)UnregisterAsRecoveredSigsListener(226-226)InterruptWorkerThread(231-234)InterruptWorkerThread(231-231)ProcessMessage(236-313)ProcessMessage(236-236)AsyncSign(1602-1606)AsyncSign(1602-1602)CreateSigShare(1632-1700)CreateSigShare(1632-1632)ForceReAnnouncement(1703-1727)ForceReAnnouncement(1703-1703)HandleNewRecoveredSig(1729-1735)HandleNewRecoveredSig(1729-1729)SelectMemberForRecovery(856-869)SelectMemberForRecovery(856-856)AsyncSignIfMember(871-949)AsyncSignIfMember(871-873)NotifyRecoveredSig(951-954)NotifyRecoveredSig(951-951)ProcessMessageSigSesAnn(315-345)ProcessMessageSigSesAnn(315-315)ProcessMessageSigSharesInv(353-388)ProcessMessageSigSharesInv(353-353)ProcessMessageGetSigShares(390-418)ProcessMessageGetSigShares(390-390)ProcessMessageBatchedSigShares(420-472)ProcessMessageBatchedSigShares(420-420)ProcessMessageSigShare(474-523)ProcessMessageSigShare(474-474)VerifySigSharesInv(347-351)VerifySigSharesInv(347-347)PreVerifyBatchedSigShares(525-565)PreVerifyBatchedSigShares(525-526)CollectPendingSigSharesToVerify(567-636)CollectPendingSigSharesToVerify(567-569)ProcessPendingSigShares(638-704)ProcessPendingSigShares(638-638)ProcessPendingSigShares(707-720)ProcessPendingSigShares(707-709)ProcessSigShare(723-773)ProcessSigShare(723-723)TryRecoverSig(775-854)TryRecoverSig(775-775)GetSessionInfoByRecvId(155-169)GetSessionInfoByRecvId(155-155)GetSessionInfoByRecvId(1366-1370)GetSessionInfoByRecvId(1366-1366)RebuildSigShare(1372-1378)RebuildSigShare(1372-1372)Cleanup(1380-1514)Cleanup(1380-1380)RemoveSigSharesForSession(1516-1529)RemoveSigSharesForSession(1516-1516)RemoveBannedNodeStates(1531-1550)RemoveBannedNodeStates(1531-1531)BanNode(1552-1576)BanNode(1552-1552)SendMessages(1205-1232)SendMessages(1205-1205)CollectSigSharesToRequest(956-1051)CollectSigSharesToRequest(956-956)CollectSigSharesToSend(1053-1101)CollectSigSharesToSend(1053-1053)CollectSigSharesToSendConcentrated(1103-1146)CollectSigSharesToSendConcentrated(1103-1103)CollectSigSharesToAnnounce(1148-1203)CollectSigSharesToAnnounce(1148-1148)SignPendingSigShares(1608-1630)SignPendingSigShares(1608-1608)WorkThreadMain(1578-1600)WorkThreadMain(1578-1578)
🪛 GitHub Actions: Clang Diff Format Check
src/llmq/signing_shares.h
[error] 1-1: Clang format differences found. Run './contrib/devtools/clang-format-diff.py -p1' (as part of the diff check) or apply clang-format to src/llmq/signing_shares.h.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
- GitHub Check: mac-build / Build source
- GitHub Check: linux64_nowallet-build / Build source
- GitHub Check: linux64-build / Build source
- GitHub Check: linux64_ubsan-build / Build source
- GitHub Check: linux64_fuzz-build / Build source
- GitHub Check: linux64_sqlite-build / Build source
- GitHub Check: arm-linux-build / Build source
- GitHub Check: win64-build / Build source
🔇 Additional comments (1)
src/llmq/signing_shares.h (1)
427-490: Confirm that Clang thread safety analysis was enabled during build and add high-load stress testing.The code annotations appear internally consistent: methods with
EXCLUSIVE_LOCKS_REQUIRED(!cs)containLOCK(cs)calls, and methods withEXCLUSIVE_LOCKS_REQUIRED(cs)containAssertLockHeld(cs)checks. However, two critical verification gaps remain:
Compile-time verification: No evidence found in CI configuration files (
.github/workflows/build-src.yml) that the build included-Wthread-safetycompiler flags. This annotation system requires explicit build configuration to enforce checks at compile time, particularly on Clang. Please confirm the successful build was performed with Clang's thread safety analysis enabled.Runtime testing: No stress testing for CSigSharesManager found in the test suite. Given the concurrency-critical nature of signal share processing and the significant change from recursive to non-recursive locking, stress testing under high load is essential to validate the change beyond compile-time checks alone.
Added a new method, ProcessRecoveredSigUnlocked, to handle recovered signatures outside of the lock to improve thread safety. Updated TryRecoverSig to utilize this new method for single-member quorum cases, ensuring proper lock management during signature processing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (1)
src/llmq/signing_shares.cpp (1)
812-823: Minor optimization opportunity for single-member quorums.For single-member quorums,
sigSharesForRecoveryandidsForRecoveryare populated (lines 814-818) but never used since we return early at line 829. Consider skipping this population whensingleMemberRecoveredSigis set, or moving lines 812-823 after the early return check to avoid unnecessary work.Example refactor:
} } + // Handle single-member quorum case after releasing the lock + if (singleMemberRecoveredSig) { + ProcessRecoveredSigUnlocked(singleMemberRecoveredSig); + return; // end of single-quorum processing + } + sigSharesForRecovery.reserve((size_t) quorum.params.threshold); idsForRecovery.reserve((size_t) quorum.params.threshold); for (auto it = sigSharesForSignHash->begin(); it != sigSharesForSignHash->end() && sigSharesForRecovery.size() < size_t(quorum.params.threshold); ++it) { const auto& sigShare = it->second; sigSharesForRecovery.emplace_back(sigShare.sigShare.Get()); idsForRecovery.emplace_back(quorum.members[sigShare.getQuorumMember()]->proTxHash); } // check if we can recover the final signature if (sigSharesForRecovery.size() < size_t(quorum.params.threshold)) { return; } } - // Handle single-member quorum case after releasing the lock - if (singleMemberRecoveredSig) { - ProcessRecoveredSigUnlocked(singleMemberRecoveredSig); - return; // end of single-quorum processing - } - // now recover itNote: This would require adjusting the lock scope, so only pursue if the optimization is worthwhile.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
src/llmq/signing_shares.cpp(4 hunks)src/llmq/signing_shares.h(2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
src/**/*.{cpp,h,cc,cxx,hpp}
📄 CodeRabbit inference engine (CLAUDE.md)
src/**/*.{cpp,h,cc,cxx,hpp}: Dash Core C++ codebase must be written in C++20 and require at least Clang 16 or GCC 11.1
Dash uses unordered_lru_cache for efficient caching with LRU eviction
Files:
src/llmq/signing_shares.cppsrc/llmq/signing_shares.h
🧠 Learnings (11)
📓 Common learnings
Learnt from: kwvg
Repo: dashpay/dash PR: 6543
File: src/wallet/receive.cpp:240-251
Timestamp: 2025-02-06T14:34:30.466Z
Learning: Pull request #6543 is focused on move-only changes and refactoring, specifically backporting from Bitcoin. Behavior changes should be proposed in separate PRs.
Learnt from: kwvg
Repo: dashpay/dash PR: 6761
File: src/chainlock/signing.cpp:247-250
Timestamp: 2025-07-29T14:32:48.369Z
Learning: In PR #6761, kwvg acknowledged a null pointer check issue in ChainLockSigner::Cleanup() method but deferred it to follow-up, consistent with the pattern of avoiding scope creep in refactoring PRs.
Learnt from: kwvg
Repo: dashpay/dash PR: 6837
File: src/stats/rawsender.cpp:0-0
Timestamp: 2025-09-09T21:36:11.833Z
Learning: In RawSender class (src/stats/rawsender.cpp), cs_net is the appropriate mutex for protecting socket access (m_sock) and network operations, not additional custom locks. The implementation correctly uses cs_net with GUARDED_BY annotations and EXCLUSIVE_LOCKS_REQUIRED to synchronize socket access between SendDirectly() and ReconnectThread().
Learnt from: kwvg
Repo: dashpay/dash PR: 6837
File: src/stats/rawsender.cpp:223-232
Timestamp: 2025-09-09T21:36:58.969Z
Learning: In RawSender class (src/stats/rawsender.cpp), socket access operations (m_sock and m_server) should be protected by the cs_net mutex, not avoiding synchronization entirely. While lock overhead concerns are valid in general, socket operations require proper synchronization via cs_net.
Learnt from: kwvg
Repo: dashpay/dash PR: 6837
File: src/stats/rawsender.cpp:0-0
Timestamp: 2025-09-09T21:36:31.504Z
Learning: In RawSender class (src/stats/rawsender.cpp), m_sock access requires synchronization using the cs_net mutex. The race condition exists between SendDirectly() (callable from any thread) and Connect() (called by ReconnectThread), where both read and write m_sock concurrently. The cs_net mutex properly coordinates these network operations.
Learnt from: kwvg
Repo: dashpay/dash PR: 6849
File: src/governance/governance.cpp:1339-1343
Timestamp: 2025-10-21T11:09:34.688Z
Learning: In the Dash Core codebase, `CacheMap` (defined in src/cachemap.h) is internally thread-safe and uses its own `mutable CCriticalSection cs` to protect access to its members. Methods like `GetSize()`, `Insert()`, `Get()`, `HasKey()`, etc., can be called without holding external locks.
📚 Learning: 2025-07-29T14:32:48.369Z
Learnt from: kwvg
Repo: dashpay/dash PR: 6761
File: src/chainlock/signing.cpp:247-250
Timestamp: 2025-07-29T14:32:48.369Z
Learning: In PR #6761, kwvg acknowledged a null pointer check issue in ChainLockSigner::Cleanup() method but deferred it to follow-up, consistent with the pattern of avoiding scope creep in refactoring PRs.
Applied to files:
src/llmq/signing_shares.cppsrc/llmq/signing_shares.h
📚 Learning: 2025-09-09T21:36:11.833Z
Learnt from: kwvg
Repo: dashpay/dash PR: 6837
File: src/stats/rawsender.cpp:0-0
Timestamp: 2025-09-09T21:36:11.833Z
Learning: In RawSender class (src/stats/rawsender.cpp), cs_net is the appropriate mutex for protecting socket access (m_sock) and network operations, not additional custom locks. The implementation correctly uses cs_net with GUARDED_BY annotations and EXCLUSIVE_LOCKS_REQUIRED to synchronize socket access between SendDirectly() and ReconnectThread().
Applied to files:
src/llmq/signing_shares.h
📚 Learning: 2025-10-02T18:29:54.756Z
Learnt from: kwvg
Repo: dashpay/dash PR: 6840
File: src/net_processing.cpp:2882-2886
Timestamp: 2025-10-02T18:29:54.756Z
Learning: Across net_processing.cpp, once LLMQContext (m_llmq_ctx) is asserted non-null, its subcomponents (e.g., isman, qdkgsman, quorum_block_processor) are treated as initialized and used without extra null checks.
Applied to files:
src/llmq/signing_shares.h
📚 Learning: 2025-09-09T21:36:58.969Z
Learnt from: kwvg
Repo: dashpay/dash PR: 6837
File: src/stats/rawsender.cpp:223-232
Timestamp: 2025-09-09T21:36:58.969Z
Learning: In RawSender class (src/stats/rawsender.cpp), socket access operations (m_sock and m_server) should be protected by the cs_net mutex, not avoiding synchronization entirely. While lock overhead concerns are valid in general, socket operations require proper synchronization via cs_net.
Applied to files:
src/llmq/signing_shares.h
📚 Learning: 2025-02-14T15:15:58.165Z
Learnt from: kwvg
Repo: dashpay/dash PR: 6529
File: src/rpc/governance.cpp:1074-1089
Timestamp: 2025-02-14T15:15:58.165Z
Learning: Code blocks marked with `// clang-format off` and `// clang-format on` directives should be excluded from clang-format suggestions as they are intentionally formatted manually for better readability.
Applied to files:
src/llmq/signing_shares.h
📚 Learning: 2025-09-09T21:36:11.833Z
Learnt from: kwvg
Repo: dashpay/dash PR: 6837
File: src/stats/rawsender.cpp:0-0
Timestamp: 2025-09-09T21:36:11.833Z
Learning: In RawSender class (src/stats/rawsender.cpp), cs_net is the appropriate mutex for protecting socket access (m_sock) and network operations, not additional custom locks.
Applied to files:
src/llmq/signing_shares.h
📚 Learning: 2025-09-09T21:36:31.504Z
Learnt from: kwvg
Repo: dashpay/dash PR: 6837
File: src/stats/rawsender.cpp:0-0
Timestamp: 2025-09-09T21:36:31.504Z
Learning: In RawSender class (src/stats/rawsender.cpp), m_sock access is protected by the cs_net mutex. This mutex coordinates between the TCP send path in SendDirectly() and the reconnection operations in Connect()/Reconnect() methods, ensuring proper synchronization of socket state.
Applied to files:
src/llmq/signing_shares.h
📚 Learning: 2025-09-09T21:36:31.504Z
Learnt from: kwvg
Repo: dashpay/dash PR: 6837
File: src/stats/rawsender.cpp:0-0
Timestamp: 2025-09-09T21:36:31.504Z
Learning: In RawSender class (src/stats/rawsender.cpp), m_sock access requires synchronization using the cs_net mutex. The race condition exists between SendDirectly() (callable from any thread) and Connect() (called by ReconnectThread), where both read and write m_sock concurrently. The cs_net mutex properly coordinates these network operations.
Applied to files:
src/llmq/signing_shares.h
📚 Learning: 2025-09-09T21:36:58.969Z
Learnt from: kwvg
Repo: dashpay/dash PR: 6837
File: src/stats/rawsender.cpp:223-232
Timestamp: 2025-09-09T21:36:58.969Z
Learning: In RawSender class (src/stats/rawsender.cpp), socket operations are properly synchronized using the cs_net mutex. The m_sock and m_server members are GUARDED_BY(cs_net), and methods like Connect(), SendDirectly(), and ReconnectThread() use appropriate locking with cs_net to prevent race conditions on socket access.
Applied to files:
src/llmq/signing_shares.h
📚 Learning: 2025-01-02T21:50:00.967Z
Learnt from: kwvg
Repo: dashpay/dash PR: 6504
File: src/llmq/context.cpp:42-43
Timestamp: 2025-01-02T21:50:00.967Z
Learning: LLMQContext manages concurrency for the `CInstantSendManager`. Previously, this was handled globally; now it's handled as a class member in `LLMQContext`, but the concurrency control remains consistent.
Applied to files:
src/llmq/signing_shares.h
🧬 Code graph analysis (1)
src/llmq/signing_shares.h (1)
src/llmq/signing_shares.cpp (76)
StartWorkerThread(199-207)StartWorkerThread(199-199)StopWorkerThread(209-219)StopWorkerThread(209-209)RegisterAsRecoveredSigsListener(221-224)RegisterAsRecoveredSigsListener(221-221)UnregisterAsRecoveredSigsListener(226-229)UnregisterAsRecoveredSigsListener(226-226)InterruptWorkerThread(231-234)InterruptWorkerThread(231-231)ProcessMessage(236-313)ProcessMessage(236-236)AsyncSign(1614-1618)AsyncSign(1614-1614)CreateSigShare(1644-1712)CreateSigShare(1644-1644)ForceReAnnouncement(1715-1739)ForceReAnnouncement(1715-1715)HandleNewRecoveredSig(1741-1747)HandleNewRecoveredSig(1741-1741)SelectMemberForRecovery(868-881)SelectMemberForRecovery(868-868)ProcessMessageSigSesAnn(315-345)ProcessMessageSigSesAnn(315-315)ProcessMessageSigSharesInv(353-388)ProcessMessageSigSharesInv(353-353)ProcessMessageGetSigShares(390-418)ProcessMessageGetSigShares(390-390)ProcessMessageBatchedSigShares(420-472)ProcessMessageBatchedSigShares(420-420)ProcessMessageSigShare(474-523)ProcessMessageSigShare(474-474)VerifySigSharesInv(347-351)VerifySigSharesInv(347-347)PreVerifyBatchedSigShares(525-565)PreVerifyBatchedSigShares(525-526)CollectPendingSigSharesToVerify(567-636)CollectPendingSigSharesToVerify(567-569)ProcessPendingSigShares(638-704)ProcessPendingSigShares(638-638)ProcessPendingSigShares(707-720)ProcessPendingSigShares(707-709)ProcessSigShare(723-773)ProcessSigShare(723-723)TryRecoverSig(775-861)TryRecoverSig(775-775)ProcessRecoveredSigUnlocked(863-866)ProcessRecoveredSigUnlocked(863-863)GetSessionInfoByRecvId(155-169)GetSessionInfoByRecvId(155-155)GetSessionInfoByRecvId(1378-1382)GetSessionInfoByRecvId(1378-1378)RebuildSigShare(1384-1390)RebuildSigShare(1384-1384)Cleanup(1392-1526)Cleanup(1392-1392)RemoveSigSharesForSession(1528-1541)RemoveSigSharesForSession(1528-1528)RemoveBannedNodeStates(1543-1562)RemoveBannedNodeStates(1543-1543)BanNode(1564-1588)BanNode(1564-1564)SendMessages(1217-1244)SendMessages(1217-1217)CollectSigSharesToRequest(968-1063)CollectSigSharesToRequest(968-968)CollectSigSharesToSend(1065-1113)CollectSigSharesToSend(1065-1065)CollectSigSharesToSendConcentrated(1115-1158)CollectSigSharesToSendConcentrated(1115-1115)CollectSigSharesToAnnounce(1160-1215)CollectSigSharesToAnnounce(1160-1160)SignPendingSigShares(1620-1642)SignPendingSigShares(1620-1620)WorkThreadMain(1590-1612)WorkThreadMain(1590-1590)
🪛 GitHub Actions: Clang Diff Format Check
src/llmq/signing_shares.cpp
[error] 803-809: Clang format differences found. Formatting changes detected by clang-format-diff.py. The patch shows reflowed arguments across multiple lines in the creation of CRecoveredSig.
[warning] 803-803: Suggested formatting change identified by clang-format-diff.py. Review and apply to conform with project style guidelines.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Build container / Build container
🔇 Additional comments (4)
src/llmq/signing_shares.h (2)
379-379: LGTM! Mutex conversion is appropriate.The change from
RecursiveMutextoMutexaligns with the PR objective to simplify locking semantics and reduce contention. The extensive lock annotations added throughout the class will help enforce non-recursive locking at compile time.
427-502: Comprehensive lock annotations added.The lock annotations are correctly applied across all methods:
- Methods that acquire
csinternally useEXCLUSIVE_LOCKS_REQUIRED(!cs)- Methods that expect
csalready held useEXCLUSIVE_LOCKS_REQUIRED(cs)ProcessRecoveredSigUnlockedcorrectly usesLOCKS_EXCLUDED(cs)to prevent it from being called while holding the lockThis should effectively prevent recursive locking patterns at compile time.
src/llmq/signing_shares.cpp (2)
783-830: Correctly defers processing to avoid recursive locking.The refactoring successfully avoids recursive locking by:
- Creating
singleMemberRecoveredSigwhile holdingcs(lines 806-807)- Releasing
csbefore callingProcessRecoveredSigUnlocked(line 828)- This prevents deadlock when
ProcessRecoveredSig→HandleNewRecoveredSig→LOCK(cs)The comment on lines 808-809 clearly explains the reasoning.
863-866: Clean helper method for unlocked processing.
ProcessRecoveredSigUnlockedprovides a clear delegation point for processing recovered signatures outside the lock. Both single-member and multi-member code paths use this helper for consistency.
| singleMemberRecoveredSig = std::make_shared<CRecoveredSig>(quorum.params.type, quorum.qc->quorumHash, id, msgHash, | ||
| recoveredSig); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix clang-format differences.
The pipeline reports clang-format differences on lines 803-809. Please run ./contrib/devtools/clang-format-diff.py -p1 and apply the suggested formatting for the CRecoveredSig construction.
🤖 Prompt for AI Agents
In src/llmq/signing_shares.cpp around lines 803 to 809, the construction of
CRecoveredSig is misformatted per clang-format; run
./contrib/devtools/clang-format-diff.py -p1 and apply its suggested changes so
the std::make_shared<CRecoveredSig>(...) call matches the project's clang-format
style (line-wrapping and indentation of the argument list), ensuring the opening
parenthesis, arguments, and closing parenthesis are aligned as the tool
prescribes and then re-run the formatter to confirm no remaining diffs.
Removed the ProcessRecoveredSigUnlocked method and directly integrated its functionality into TryRecoverSig. This change simplifies the handling of recovered signatures for single-member quorums, ensuring proper lock management and enhancing code clarity.
knst
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
utACK e98ba12
UdjinM6
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
utACK e98ba12
update lock requirements for various methods. This change enhances thread safety and simplifies locking semantics throughout the class.
Issue being fixed or feature implemented
After other recent lock contention improvements, this RecursiveMutex is now the source of the most contention during high load. Convert the mutex from Recursive to non-recursive. Surprisingly compiler is happy with this and we don't need more invasive changes
What was done?
How Has This Been Tested?
Builds
Breaking Changes
None
Checklist:
Go over all the following points, and put an
xin all the boxes that apply.