Skip to content

Commit fd46c4c

Browse files
Merge dashpay#5962: fix: deadlock over cs_main and contributionsCacheCs in dkssessionmgr
ded1b5a fix: deadlock over cs_main and contributionsCacheCs in dkssessionmgr (Konstantin Akimov) Pull request description: ## Issue being fixed or feature implemented **It fixes rpc failure: "Work queue depth exceeded"** As I checked on running `dashd` in deadlock condition: Thread 78 is a thread that keep `cs_main`: ``` #14 0x0000aaaad1f8d604 in BuildSimplifiedMNListDiff () at evo/simplifiedmns.cpp:364 ``` but it is locked by `contributionsCacheCs` ``` #8 llmq::CDKGSessionManager::GetVerifiedContributions () at llmq/dkgsessionmgr.cpp:392 ``` On other hand, `contributionsCacheCs` is blocked by Thread 59 ``` #17 0x0000aaaad1ba1940 in llmq::CDKGSessionManager::GetVerifiedContributions () at llmq/dkgsessionmgr.cpp:393 ``` and it makes circuit lock by waiting `cs_main` in ``` #9 ReadBlockFromDisk () at node/blockstorage.cpp:75 ``` See dashpay/dash-issues#69 for more details Seems introduced there: dashpay#3911 ## What was done? Deadlock is removed by reducing scope of mutex ## How Has This Been Tested? I reviewed 2 different servers which have status `work queue exceeded`, both have same deadlock, so, this patch should fix this issue. Once this fix is merged, we can test it on testnet. ## Breaking Changes N/A ## Checklist: - [x] I have performed a self-review of my own code - [x] I have commented my code, particularly in hard-to-understand areas - [ ] I have added or updated relevant unit/integration/functional/e2e tests - [ ] I have made corresponding changes to the documentation - [x] I have assigned this pull request to a milestone Top commit has no ACKs. Tree-SHA512: 4fe5c03c464ee6934fb927b897f007b65a8995723196edaffdae067edee7067da151130d4c4bac47d3418fdad5c8e130682f42d7ef9c044380a8c8fff78ee008
1 parent ab7bc12 commit fd46c4c

File tree

1 file changed

+3
-1
lines changed

1 file changed

+3
-1
lines changed

src/llmq/dkgsessionmgr.cpp

+3-1
Original file line numberDiff line numberDiff line change
@@ -389,7 +389,6 @@ void CDKGSessionManager::WriteEncryptedContributions(Consensus::LLMQType llmqTyp
389389

390390
bool CDKGSessionManager::GetVerifiedContributions(Consensus::LLMQType llmqType, const CBlockIndex* pQuorumBaseBlockIndex, const std::vector<bool>& validMembers, std::vector<uint16_t>& memberIndexesRet, std::vector<BLSVerificationVectorPtr>& vvecsRet, std::vector<CBLSSecretKey>& skContributionsRet) const
391391
{
392-
LOCK(contributionsCacheCs);
393392
auto members = utils::GetAllQuorumMembers(llmqType, pQuorumBaseBlockIndex);
394393

395394
memberIndexesRet.clear();
@@ -398,6 +397,9 @@ bool CDKGSessionManager::GetVerifiedContributions(Consensus::LLMQType llmqType,
398397
memberIndexesRet.reserve(members.size());
399398
vvecsRet.reserve(members.size());
400399
skContributionsRet.reserve(members.size());
400+
401+
// NOTE: the `cs_main` should not be locked under scope of `contributionsCacheCs`
402+
LOCK(contributionsCacheCs);
401403
for (const auto i : irange::range(members.size())) {
402404
if (validMembers[i]) {
403405
const uint256& proTxHash = members[i]->proTxHash;

0 commit comments

Comments
 (0)