Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prepose consensus commit prologue in checkpoint #18023

Merged

Conversation

halfprice
Copy link
Contributor

@halfprice halfprice commented Jun 1, 2024

Description

This PR moves consensus commit prologue to the beginning of the checkpoint if it presents. This is to make sure
that the consensus commit prologue is before all the cancelled transactions in the checkpoint, and hence
when doing sequential replay, we know which transactions are cancelled.

This is achieved by finding the consensus commit prologue in the pending checkpoint, and always
places it the first before all other sorted transactions.

It also builds on several assumptions:

  • There is at most one consensus commit prologue transactions in a pending checkpoint.
  • The consensus commit prologue shouldn't have any dependencies that aren't in any previous checkpoints.

Test plan

Added consensus commit prologue invariant check for all integration simtests.


Release notes

Check each box that your changes affect. If none of the boxes relate to your changes, release notes aren't required.

For each box you select, include information after the relevant heading that describes the impact of your changes that a user might notice and any actions they must take to implement updates.

  • Protocol:
  • Nodes (Validators and Full nodes):
  • Indexer:
  • JSON-RPC:
  • GraphQL:
  • CLI:
  • Rust SDK:

Copy link

vercel bot commented Jun 1, 2024

@halfprice is attempting to deploy a commit to the Mysten Labs Team on Vercel.

A member of the Team first needs to authorize it.

Copy link

vercel bot commented Jun 1, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
sui-docs ✅ Ready (Inspect) Visit Preview 💬 Add feedback Jun 12, 2024 4:12am


// No other dependencies of this consensus commit prologue that haven't been included
// in any previous checkpoint.
assert_eq!(effects_in_current_checkpoint.len(), 1);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was debating whether these should be debug_assert or assert, and eventually decides to make it assert. This is because all this PR is built on the assumption that there is at most one consensus commit prologue in a pending checkpoint, and all its previous dependencies must have been included in the previous checkpoint. If this is true, something must be wrong and we probably don't want the node to move forward to produce more false state.

Comment on lines 975 to 982
let mut effects_in_current_checkpoint = BTreeSet::new();
let sorted_tx_effects_included_in_checkpoint = self
.resolve_checkpoint_transactions(pending.roots, &mut effects_in_current_checkpoint)
.await?;
let new_checkpoint = self
.create_checkpoints(sorted_tx_effects_included_in_checkpoint, pending.details)
.await?;
self.write_checkpoints(height, new_checkpoint).await?;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code is structure this way primarily due to easy integration with @aschran 's #17955 .

If this lends before that one, we just need to put

let sorted_tx_effects_included_in_checkpoint = self
    .resolve_checkpoint_transactions(pending.roots, &mut effects_in_current_checkpoint)
    .await?;

into a loop over all the pending checkpoints, and concatenate all the sorted_tx_effects_included_in_checkpoint

Comment on lines 2601 to 2675
if let Some(consensus_commit_prologue_root) = consensus_commit_prologue_root {
if self
.protocol_config()
.prepose_consensus_commit_prologue_in_checkpoints()
{
// Put consensus commit prologue root at the beginning of the checkpoint roots.
checkpoint_roots.push(consensus_commit_prologue_root);
} else {
roots.insert(consensus_commit_prologue_root);
}
}
Copy link
Contributor Author

@halfprice halfprice Jun 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In theory, this part does not necessarily need to be guarded by the feature flag, because the causal sort in checkpoint builder is supposed to be input order agnostic. However, I'm scared that if the order here somehow affects the causal sort order in checkpoint builder, it would cause fork when upgrading. So I still guarded preposing the consensus commit prologue here using the feature flag to be cautious.

Copy link
Contributor

@mystenmark mystenmark left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved with two nits

}
sorted.extend(CausalOrder::causal_sort(unsorted));

fail_point_if!("assert-prologue-first-in-checkpoint", || {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe this could just be if cfg!(msim) instead of a failpoint

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Ok(self
.state
.get_transaction_cache_reader()
.get_transaction_block(&root_digests[0])?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you expect here instead of using filter? if the transaction block is not found it is a serious bug

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean remove the "?" instead of using an expect, correct? If so, done.

I don't think we can remove the filter since it's possible that a consensus commit doesn't contain a prologue txn.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok looks good now

@halfprice halfprice merged commit 7e33c3f into MystenLabs:main Jun 12, 2024
39 of 43 checks passed
@halfprice halfprice deleted the zhewu/prologue-first-in-checkpoint branch June 12, 2024 04:40
tx-tomcat pushed a commit to tx-tomcat/sui-network that referenced this pull request Jul 29, 2024
## Description 

This PR moves consensus commit prologue to the beginning of the
checkpoint if it presents. This is to make sure
that the consensus commit prologue is before all the cancelled
transactions in the checkpoint, and hence
when doing sequential replay, we know which transactions are cancelled.

This is achieved by finding the consensus commit prologue in the pending
checkpoint, and always
places it the first before all other sorted transactions.

It also builds on several assumptions:
- There is at most one consensus commit prologue transactions in a
pending checkpoint.
- The consensus commit prologue shouldn't have any dependencies that
aren't in any previous checkpoints.

## Test plan 

Added consensus commit prologue invariant check for all integration
simtests.

---

## Release notes

Check each box that your changes affect. If none of the boxes relate to
your changes, release notes aren't required.

For each box you select, include information after the relevant heading
that describes the impact of your changes that a user might notice and
any actions they must take to implement updates.

- [ ] Protocol: 
- [ ] Nodes (Validators and Full nodes): 
- [ ] Indexer: 
- [ ] JSON-RPC: 
- [ ] GraphQL: 
- [ ] CLI: 
- [ ] Rust SDK:
Comment on lines +1105 to +1115
if let Some((ccp_digest, ccp_effects)) = consensus_commit_prologue {
#[cfg(debug_assertions)]
{
// When consensus_commit_prologue is extracted, it should not be included in the `unsorted`.
for tx in unsorted.iter() {
assert!(tx.transaction_digest() != &ccp_digest);
}
}
sorted.push(ccp_effects);
}
sorted.extend(CausalOrder::causal_sort(unsorted));

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@halfprice fyi this line threw a warning for me when i was compiling this morning:

Screenshot 2024-11-29 at 3 05 12 AM

Suggested change
if let Some((ccp_digest, ccp_effects)) = consensus_commit_prologue {
#[cfg(debug_assertions)]
{
// When consensus_commit_prologue is extracted, it should not be included in the `unsorted`.
for tx in unsorted.iter() {
assert!(tx.transaction_digest() != &ccp_digest);
}
}
sorted.push(ccp_effects);
}
sorted.extend(CausalOrder::causal_sort(unsorted));
if let Some((_ccp_digest, ccp_effects)) = consensus_commit_prologue {
#[cfg(debug_assertions)]
{
// When consensus_commit_prologue is extracted, it should not be included in the `unsorted`.
for tx in unsorted.iter() {
assert!(tx.transaction_digest() != &_ccp_digest);
}
}
sorted.push(ccp_effects);
}
sorted.extend(CausalOrder::causal_sort(unsorted));

Not sure if this would fix the warning or if it even makes sense to fix?...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants