-
Notifications
You must be signed in to change notification settings - Fork 1.2k
refactor: consolidate activeMasternodeInfo{Cs} into CActiveMasternodeManager, create NodeContext alias, reduce globals usage #5940
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Since this isn't backports; it'd be nice to have Clang-Diff format clean: https://github.com/dashpay/dash/actions/runs/8316932637/job/22757066218?pr=5940 |
I look at these clang-diff suggestions and I hate most of it tbh... |
I have an idea: #5942 |
src/governance/governance.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this code is equivalent due to || property.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 please revert
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was a stylistic choice to keep the check for masternode mode and the check that relies on the previous check that masternode mode is active on separate lines. Reading the function without context gives the impression those are two separate conditions being checked while the second condition itself relies on the first condition's result.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
clang-tidy will be annoyed about two if blocks next to each other with duplicate bodies
src/init.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why changed order?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To enforce the correct order of changes.
Changes to the state (activeMasternodeInfo) must happen before changes to the manager (activeMasternodeManager), as the next commit changes activeMasternodeInfo to activeMasternodeManager->m_info, the following snippet would ensue if not for the ordering changes.
if (fMasternodeMode) {
UnregisterValidationInterface(activeMasternodeManager.get());
activeMasternodeManager.reset();
}
{
LOCK(activeMasternodeManager->cs);
// make sure to clean up BLS keys before global destructors are called (they have allocated from the secure memory pool)
activeMasternodeManager->m_info.blsKeyOperator.reset();
activeMasternodeManager->m_info.blsPubKeyOperator.reset();
}
src/init.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we active fMasternodeMode earlier as possible because it is widely used all over code base?
For instance, connman that now is initialized after fMasternodeMode flag uses it.
Does it indeed affect its implementation? I don't know, I haven't found a proof. Can it affect implementation? yes, an I can't say opposite before carefully checking of all objects initialized between line 1613 and line 1860 which are old and new initialization position of fMasternodeMode.
Other example is LLMQContext. if LLMQContext would be initialized before initialization of fMasternodeMode then worker qdkgsman would never start c5cd9c3#diff-3ce892d5580c8044e0b4ebc4d1fe596d919788682fc38ccd66913508ff8fc43dR80
so, I'd say that this code moving is a dangerous and could cause hidden bugs that can be discovered years later
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should be totally fine; as the call to LLMQContext::Start is also in AppInitMain; just a few hundred lines later: https://github.com/dashpay/dash/blob/d58e41514d7d93c61cbcd237c855674a87798f38/src/init.cpp#L1959;|
also my guess is that @kwvg was forced to due this by compiler / linter
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but referring to commit: 513c3f9 I am curious why we are moving it down?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code had to be moved somewhere.
Initialization of activeMasternodeInfo and activeMasternodeManager were disjoint since they were globals in their own right but since activeMasternodeInfo's contents were being subsumed into activeMasternodeManager, the state needed to be initialized after the manager (and later commits remove later initialization by adding it into the manager's constructor).
It was moved downwards instead of upwards because downwards was where the manager was being initialized and to avoid upsetting anything it was kept that way. As a bonus, keeping it downward meant I get to keep passing a dereferenced node.connman.
Is it possible that bugs may show up but the node doesn't start until much later and aside from CDKGSessionManager, I'm not aware of any constructors that check on "masternode mode" status.
src/masternode/node.h
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: unrelated
src/masternode/node.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if it someone will try to access this object when we are in desctructor - we have a bigger race and locking cs won't help already.
No need cs here
I see it is updated in commits later. But at first, you don't need descrtuctor even for unique_ptr, so, can be simplified before refactoring unique_ptr to members.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That change was made to map where the changes from init.cpp went
src/masternode/node.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no need forcely clean m_info; they would be desctroyed as a part of activeMasternodeManager
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I agree; wouldn't default destructor do the same stuff?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We got rid of the destructor in 4363e3c
src/init.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
btw, why it is here, not in llmq-context? without llmq context activeMasternodeManager has no usage; and it is heavily used in llmq-context.
(not a blocker to get merged, just an opener for a discussion, either way is fine for me)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because activeMasternodeManager
- Relies on
CConnmanandCDeterministicMNManagerto exist (source) - It doesn't depend on LLMQ managers to exist (unlike most LLMQ managers, which depend on each other)
- It's used in
MNAuth(source) andCGovernanceManager(source)
It's better suited for a future MNContext after we deglobalize some more.
|
This pull request has conflicts, please rebase. |
src/evo/mnauth.cpp
Outdated
| if (!fMasternodeMode || activeMasternodeInfo.proTxHash.IsNull()) { | ||
| return; | ||
| } | ||
| if (activeMasternodeInfo.proTxHash.IsNull()) return; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be a nice refactor to make activeMasternodeInfo into an optional; or some other type that cannot be accessed accidentally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
activeMasternodeInfo (now m_info) is an unconditionally initialized part of activeMasternodeManager, which is conditionally initialized. If we had to pass activeMasternodeManager as an optional reference, the argument list would be accepting an std::optional<std::reference_wrapper<CActiveMasternodeManager>> and unwrapping it would involve mn_activeman.value().get() (std::optional, std::reference_wrapper).
It can be made less contrived by using mn_activeman->get() but regardless, it would require that at the start of every function that uses it more than once, we'd have to setup something like...
auto mn_activeman = m_mn_activeman_ref.value().get();
// [...]
auto mnList = m_dmnman.GetListAtChainTip();
auto dmn = WITH_LOCK(mn_activeman.cs, return mnList.GetValidMNByCollateral(mn_activeman.GetOutPoint()));Arguably we have to do something similar with all the added assertions already but since the codebase is already used to that approach, we've continued on with it. Maybe we could revisit this approach but IMO it's out of the scope of this PR.
|
please consider: PastaPastaPasta@c52d169 |
src/governance/governance.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 please revert
src/evo/mnauth.cpp
Outdated
| uint256 myProTxHash; | ||
| if (fMasternodeMode) { | ||
| myProTxHash = WITH_LOCK(activeMasternodeInfoCs, return activeMasternodeInfo.proTxHash); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer something like
const uint256 myProTxHash = []() {
if (fMasternodeMode) {
return WITH_LOCK(activeMasternodeInfoCs, return activeMasternodeInfo.proTxHash);
} else {
return uint256();
}
}();
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Feels overengineered for a variable that doesn't get many opportunities to be mutated (first by CActiveMasternodeManager::GetProTxHash and then by llmq::utils::DeterministicOutboundConnection, which already accepts a const ref).
If the value was passed through multiple functions and is a part of a large function, the tradeoff would be worth the certainty the value was never mutated but this doesn't seem like such a case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Feels overengineered
only in C++ :D any other modern language would have a syntax that makes this feel more build in instead of "engineered" but the only good way to do think in c++ is to use a lambda.
let myProTxHash = if fMasternodeMode {
WITH_LOCK(activeMasternodeInfoCs, return activeMasternodeInfo.proTxHash)
} else {
uint256()
}I guess in this instance we can do
const uint256 myProTxHash = fMasternodeMode ?
WITH_LOCK(activeMasternodeInfoCs, return activeMasternodeInfo.proTxHash) :
uint256();
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The latter is acceptable to me but the moment we have to add additional assertions, the former suggestion seems to be the only way to retain constness.
It's been trimmed down though to be more similar to...
const uint256 myProTxHash = [this]() {
if (!fMasternodeMode) return uint256();
assert(m_mn_activeman);
return WITH_LOCK(m_mn_activeman->cs, return m_mn_activeman->GetProTxHash());
}();Resolved in latest push
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could be simplified a bit more:
const uint256 myProTxHash = [this]() {
if (!fMasternodeMode) return uint256{};
return WITH_LOCK(Assert(m_mn_activeman)->cs, return m_mn_activeman->GetProTxHash());
}();
anyway, this lock cs would be removed once #5940 (comment) is done
and it would become just
const uint256 myProTxHash{fMasternodeMode ? Assert(m_mn_activeman)->GetProTxHash() : uint256{}};
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
WITH_LOCK(Assert(m_mn_activeman)->cs, return m_mn_activeman->GetProTxHash()); doesn't work, I've mentioned this in the PR description.
You cannot combine a
WITH_LOCKand anAssert(in either mutex or accessed value), doing so will cause errors if-Werror=thread-safetyis enabled. This is whyasserts are added even when it would intuitively seem thatAssertwould've been more appropriate to use.
src/init.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but referring to commit: 513c3f9 I am curious why we are moving it down?
src/llmq/signing_shares.cpp
Outdated
| if (!fMasternodeMode || WITH_LOCK(activeMasternodeInfoCs, return activeMasternodeInfo.proTxHash.IsNull())) { | ||
| return; | ||
| } | ||
| if (!fMasternodeMode) return; | ||
| if (WITH_LOCK(activeMasternodeInfoCs, return activeMasternodeInfo.proTxHash.IsNull())) return; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should revert
src/masternode/node.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I agree; wouldn't default destructor do the same stuff?
| return m_info.proTxHash; | ||
| } | ||
|
|
||
| [[nodiscard]] CBLSPublicKey GetPubKey() const EXCLUSIVE_LOCKS_REQUIRED(cs); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should lock internally if we are returning a copy; an external lock should only be required for a reference
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are locking externally as opposed to internally to minimize the amount of code modification (existing code that read with activeMasternodeInfo used to hold the lock before attempting to read from it) and reducing the amount of lock-unlock events (functions that tend to read one value tend to read more).
Internal locking was used for functions introduced in this PR like Decrypt and Sign, because its usage doesn't usually involve accessing other members (where it might be reasonable to refactor the code to have all accesses under one locked scope), the exception to that rule is CMNAuth::PushMNAUTH, which I had to refactor so that it fetches values under one externally locked scope and signs outside, with locking done internally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm.. Maybe let's add a TODO then? I do think we should only lock externally if we are getting a ref (or something else lifetime bound)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, resolved in latest push
|
This pull request has conflicts, please rebase. |
… mode A later commit will be moving activeMasternodeInfo into activeMasternodeManager and that is only conditionally initialized if the node is in masternode mode, which will render access attempts outside of masternode mode invalid. We need to adjust behaviour to account for that.
Avoid passing around the operator secret key if we can help it. Ask CActiveMasternodeManager to perform the operation for you instead.
… functions External logic should not be able to mutate the CActiveMasternodeManager state (i.e. CActiveMasternodeInfo). Access is brokered through getter functions.
Using unique_ptr is a relic of activeMasternodeInfo, as it was accessible independent of CActiveMasternodeManager. As it is now only accessible if CActiveMasternodeManager is initialized, we can make it part of its constructor and make them const. Removes the assertion and sanity checks to see if key pair is valid.
We could use std::optional<std::reference_wrapper<const CActiveMasternodeManager>> but then we'd also have to contend with accessing the value with mn_activeman. value().get(). We assert m_mn_activeman is present instead of fast-failing when it isn't because of the fMasternodeMode fast-fail check. If we are in masternode mode, m_mn_activeman should point to a valid target. If it doesn't, something's gone wrong.
knst
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 815e4f8
PastaPastaPasta
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
utACK 815e4f8
UdjinM6
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, light ACK
…overnance global b4477e4 trivial: don't print `fDisableGovernance` value anymore (Kittywhiskers Van Gogh) a42370d refactor: remove fDisableGovernance global, define default in variable (Kittywhiskers Van Gogh) b152759 refactor: remove fMasternodeMode and fDisableGovernance from Qt code (Kittywhiskers Van Gogh) 9402ce7 refactor: limit usage of fDisableGovernance, use `IsValid()` instead (Kittywhiskers Van Gogh) 106f6bd refactor: reduce fMasternodeMode usage in governance and mnauth (Kittywhiskers Van Gogh) 3ba293f refactor: remove fMasternodeMode checks in CActiveMasternodeManager (Kittywhiskers Van Gogh) b0216ac refactor: remove fMasternodeMode usage in rpc logic (Kittywhiskers Van Gogh) 4d629a0 refactor: limit fMasternodeMode usage in blockstorage, init, net_processing (Kittywhiskers Van Gogh) a9cbdfc refactor: remove fMasternodeMode usage from llmq logic (Kittywhiskers Van Gogh) c62a3d5 refactor: remove fMasternodeMode usage from coinjoin logic (Kittywhiskers Van Gogh) Pull request description: ## Motivation Since #5940, `CActiveMasternodeManager` ceased to be a global variable and became a conditional smart pointer initialized based on the value of `fMasternodeMode`. Likewise, since #5555, we can tell if any `CFlatDB`-based manager has successfully loaded its database. `CGovernanceManager` is one of them and conditionally loads its database based on the value of `fGovernanceDisabled`. `fMasternodeMode` and `fGovernanceDisabled` were (and the former to a certain degree still is) unavoidable globals due to the way the functionality they influenced was structured (i.e. decided in initialization code with no way to query from the manager itself). As we can directly ask the managers now, we can start reducing the usage of these globals and at least in this PR, get rid of one of them. This PR was the idea of PastaPastaPasta, special thanks for the suggestion! ## Additional Information * There are two conventions being used for checking `nullptr`-ity of a pointer, `if (mn_activeman)` and `if (mn_activeman != nullptr)`. The former is used in initialization and RPC code due to existing conventions there ([source](https://github.com/dashpay/dash/blob/2dacfb08bdf02fdaf0740edac19435bfaa0e52d3/src/init.cpp#L1659-L1677), [source](https://github.com/dashpay/dash/blob/2dacfb08bdf02fdaf0740edac19435bfaa0e52d3/src/rpc/net.cpp#L942-L945), [source](https://github.com/dashpay/dash/blob/2dacfb08bdf02fdaf0740edac19435bfaa0e52d3/src/rpc/misc.cpp#L215-L218)). The latter is used whenever the value has to be passed as a `bool` (you cannot pass the implicit conversion to a `bool` argument without explicitly casting it) and in Dash-specific code where it is the prevalent convention ([source](https://github.com/dashpay/dash/blob/2dacfb08bdf02fdaf0740edac19435bfaa0e52d3/src/governance/governance.cpp#L125), [source](https://github.com/dashpay/dash/blob/2dacfb08bdf02fdaf0740edac19435bfaa0e52d3/src/coinjoin/client.cpp#L1064)). Unfortunately, that means this PR expresses the same thing sometimes in two different ways but this approach was taken so that reading is consistent within the same file. Codebase-wide harmonization is outside the scope of this PR. * Where `mn_activeman` isn't directly available, the result of the check is passed as an argument named `is_masternode` and/or set for the manager during its construction as `m_is_masternode` (`const bool`) as it is expected for the `CActiveMasternodeManager`'s presence or absence to remain as-is for the duration of the manager's lifetime. This does mean that some parts of the codebase check for `mn_activeman` while others check for {`m_`}`is_masternode`, which does reduce clarity while reading. Suggestions on improving this are welcomed. * One of the reasons this PR was made was to avoid having to deal the _possibility_ of `fMasternodeMode` or `fDisableGovernance` from desynchronizing from the behaviour of the managers it's suppose to influence. It's why additional assertions were placed in to make sure that `fMasternodeMode` and the existence of `mn_activeman` were always in sync ([source](https://github.com/dashpay/dash/blob/2dacfb08bdf02fdaf0740edac19435bfaa0e52d3/src/evo/mnauth.cpp#L137-L139), [source](https://github.com/dashpay/dash/blob/2dacfb08bdf02fdaf0740edac19435bfaa0e52d3/src/rpc/governance.cpp#L319-L320)). But removing the tracking global and relying on a manager's state itself prevents a potential desync, which is what this PR is aiming to do. ## Breaking Changes None expected. ## Checklist: - [x] I have performed a self-review of my own code - [x] I have commented my code, particularly in hard-to-understand areas - [x] I have added or updated relevant unit/integration/functional/e2e tests - [x] I have made corresponding changes to the documentation **(note: N/A)** - [x] I have assigned this pull request to a milestone _(for repository code-owners and collaborators only)_ Top commit has no ACKs. Tree-SHA512: 7861afd17c83b92af4c95b2841e9b0f676116eb3f234c4d0b1dcd3c20395452893e8ca3a17c7225389c8411dac80aeb5050f06a2ae35df5ec48998a571ef120c
Additional Information
CActiveMasternodeManager, unlike other managers, is conditionally initialized (specifically, when the node is hosting a masternode). This means that checks need to be made to ensure that the conditions needed to initialize the manager are true or that the pointer leads to a valid manager instance.As the codebase currently checks (and fast-fails) based on the node being in "masternode mode" (
fMasternodeMode) or not, we will continue with this approach, but with additional assertions after the masternode mode check if the manager exists.Though, since
activeMasternodeInfo(Cs) are global variables, they can be accessed regardless of whether the corresponding manager exists. This means some parts of the codebase attempt to fetch information about the (nonexistent) active masternode before determining if it should use the masternode mode path or not (looking at you,CMNAuth::ProcessMessage)Moving them into
CActiveMasternodeManagermeant adding checks before attempting to access information about the masternode, as they would no longer be accessible with dummy values (here) on account of being part of the conditionally initialized manager.In an attempt to opportunistically dereference the manager,
CDKGSessionManager(accepting a pointer) was dereferencing the manager before passing it toCDKGSessionHandler. This was done under the assumption thatCDKGSessionManagerwould only ever be initialized in masternode mode.This is not true. I can confirm that because I spent a few days trying to debug test failures.
CDKGSessionHandleris initialized in two scenarios:-watchquorumsflag is enabledThe latter scenario doesn't initialize
CActiveMasternodeManager.Furthermore, the DKG round thread is started unconditionally (here) and the
CDKGSessionHandler::StartThreads>CDKGSessionHandler::StartThread>CDKGSessionHandler::PhaseHandlerThread>CDKGSessionHandler::HandleDKGRound>CDKGSessionHandler::InitNewQuorum>CActiveMasternodeManager::GetProTxHashcall chain reveals an attempt to fetch active masternode information without any masternode mode checks.This behaviour has now been changed and the thread will only be spun up if in masternode mode.
Dereferencing so far has been limited to objects that primarily hold data (like
CCoinJoinBroadcastTxorCGovernanceObject) as they should not have knowledge of node's state (that responsibility lies with whatever manager manipulates those objects), perform one-off operations and static functions.activeMasternodeInfoallowed its members to be read-write accessible to anybody who asked. Additionally, signing and decrypting involved borrowing the operator secret key from the active masternode state to perform those operations.This behaviour has now been changed. The internal state is now private and accessible read-only as a const ref (or copy) and
Decrypt/Signfunctions have been implemented to allow those operations to happen without having another manager access the operator private key in order to do so.You cannot combine a
WITH_LOCKand anAssert(in either mutex or accessed value), doing so will cause errors if-Werror=thread-safetyis enabled. This is whyasserts are added even when it would intuitively seem thatAssertwould've been more appropriate to use.Future Considerations
Currently there are no unit tests that test the functionality of
CActiveMasternodeManageras it's never initialized in test contexts, breakage had to be found using functional tests. Perhaps some (rudimentary) tests forCActiveMasternodeManagermay prove to be valuable.Breaking Changes
Not really. Some behaviour has been modified but nothing that should necessitate updates or upgrades.
Checklist: