Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multi-process shared-objects execution #2059

Merged
merged 10 commits into from
May 19, 2022
Merged

Multi-process shared-objects execution #2059

merged 10 commits into from
May 19, 2022

Conversation

asonnino
Copy link
Contributor

@asonnino asonnino commented May 18, 2022

  • Multicore execution of shared-object transactions
  • Speed up unit tests that use Narwhal
  • Fix underminist failure of a few unit tests

A bit of context on how this work. Upon receiving a shared-object certificate, the Sui authority first calls try_skip_consensus to see if locks already exist and it can directly re-attempt execution (without going through consensus). If it can't, it sends the transaction through consensus. After sequencing, consensus calls asign_shared_locks to assign shared locks to every shared-object in the certificate. Note that there is only one task running asign_shared_locks (this is safety-critical). Finally, the certificate returns to the Sui authority who calls execute_shared_transaction to attempt execution. Note that the order of the checks in the functions below are important to avoid race-conditions. Further, the shared-objects logic assumes that every shared-object transaction updates at least one owned-transaction (the gas object); this avoids a number of race conditions upon committing the transaction effects to storage.

[1 task per connection] try_skip_consensus:
1. If effects { return info }
2. locks_exist = Check shared-locks exist
3. If locks_exist { Check shared-locks; then atomically execute & remove shared-locks }

[only run by 1 task] asign_shared_locks:
1. locks_exist = Check shared-locks exist
2. If effects { return info }
3. If !locks_exist { Assign locks }

[1 task per connection] execute_shared_transaction:
1. If effects { return info }
2. Check shared-locks; then atomically execute & remove shared-locks

@asonnino asonnino self-assigned this May 18, 2022
@asonnino asonnino added Priority: High Very important task, not blocking but potentially delaying milestones or limiting our offering Status: Needs Review PR is ready for review sui-node test Type: Enhancement New feature or request labels May 18, 2022
@asonnino
Copy link
Contributor Author

CC @mystenmark to coordinate with #1953

@asonnino asonnino changed the title Multicore Multi-process shared-objects execution May 18, 2022
@asonnino asonnino added Priority: Medium A nice to have feature or annoying bug, non-blocking and no delays expected if we punt on it and removed Priority: High Very important task, not blocking but potentially delaying milestones or limiting our offering labels May 18, 2022
@@ -35,7 +35,7 @@ pub struct ValidatorConfig {
network_address: Multiaddr,
metrics_address: Multiaddr,

consensus_config: ConsensuseConfig,
pub consensus_config: ConsensuseConfig,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these don't need to be made pub as there are accessors implemented below for that

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need to modify them to not use the default values (it was one of the reasons why our tests are so slow)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally we need a constructor for narwhal configs, but I figured this could be made in a separate PR (the config structure is pretty complex)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. Yeah this is something that we can fix later (enabling configuration generation with a base template)

@oxade oxade self-requested a review May 18, 2022 23:40
@asonnino asonnino enabled auto-merge (squash) May 18, 2022 23:43
@asonnino asonnino added this to the Pre Testnet milestone May 18, 2022
@asonnino asonnino linked an issue May 18, 2022 that may be closed by this pull request
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Priority: Medium A nice to have feature or annoying bug, non-blocking and no delays expected if we punt on it Status: Needs Review PR is ready for review sui-node test Type: Enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Multi-task shared-object execution
2 participants