-
Notifications
You must be signed in to change notification settings - Fork 11.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor ConsensusHandler in preparation for parallel signature verification #5344
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea to remove code from authority.rs
if self | ||
.database | ||
.consensus_message_processed(certificate.digest()) | ||
.expect("Storage error") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we add some comments about why we panic upon this storage error?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to be sure: panicking here ensures that the consensus transaction will not be skipped (ie. validator stops/crashes and the execution indices do not update with future transactions), right? (because we cannot skip that transaction)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think in general when we have storage error there is not much we can do. Yes, this will not skip transaction, but essentially will shutdown the node
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes look good to me, thanks @andll 👍
@@ -1,3 +1,4 @@ | |||
use std::sync::Arc; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
probably the license header checker will complain as well, this will need to be after the header comments
@@ -1114,7 +1114,7 @@ impl<S: Eq + Debug + Serialize + for<'de> Deserialize<'de>> SuiDataStore<S> { | |||
Ok(()) | |||
} | |||
|
|||
pub async fn consensus_message_processed( | |||
pub fn consensus_message_processed( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice catch 👍
…ication This PR refactors ConsensusHandler in order to prepare for parallel verification of the transaction signatures. On Sui side: (1) Break down consensus handling into two independent functions on the `AuthorityState` - one for (potentially parallel) verification and one for sequential handling of the transaction. (2) Move `ConsensusHandler` into separate file - `authority.rs` is growing big and parallel signature verification will add even more code to ConsensusHandler. (3) Abolish `NarwhalHandlerError` - after separating `handle_consensus_transaction` and `verify_consensus_transaction`, `ConsensusHandler` can now decide what to do based on what function has failed. Note that in practice nothing has changed in how errors are processed by this PR - previously we were mapping errors from `verify_narwhal_transaction` into `NarwhalHandlerError::Skip`, and now we just call it directly from `ConsensusHandler`. (4) Two utilization metrics on the `ConsensusHandler` side, namely `verify_narwhal_transaction_duration_mcs` and `handle_consensus_duration_mcs` slightly changed what they measure - previously verification metric was fraction of handle_consensus timer, now they are two independent steps. The total consensus utilization can now be measured as sum of those two metrics (until parallel verification is introduced). On Narwhal side: (1) Minor interface change - pass `Arc<ConsensusOutput>` instead of `&ConsensusOutput` from nw executor - this will allow to caller to cheaply clone ConsensusOutput when it needs to pass it to verification thread
This PR refactors ConsensusHandler in order to prepare for parallel verification of the transaction signatures.
On Sui side:
(1) Break down consensus handling into two independent functions on the
AuthorityState
- one for (potentially parallel) verification and one for sequential handling of the transaction.(2) Move
ConsensusHandler
into separate file -authority.rs
is growing big and parallel signature verification will add even more code to ConsensusHandler.(3) Abolish
NarwhalHandlerError
- after separatinghandle_consensus_transaction
andverify_consensus_transaction
,ConsensusHandler
can now decide what to do based on what function has failed. Note that in practice nothing has changed in how errors are processed by this PR - previously we were mapping errors fromverify_narwhal_transaction
intoNarwhalHandlerError::Skip
, and now we just call it directly fromConsensusHandler
.(4) Two utilization metrics on the
ConsensusHandler
side, namelyverify_narwhal_transaction_duration_mcs
andhandle_consensus_duration_mcs
slightly changed what they measure - previously verification metric was fraction of handle_consensus timer, now they are two independent steps. The total consensus utilization can now be measured as sum of those two metrics (until parallel verification is introduced).On Narwhal side:
(1) Minor interface change - pass
Arc<ConsensusOutput>
instead of&ConsensusOutput
from the narwhal executor - this will allow caller to cheaply clone Arc when it needs to pass it to verification thread