Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

remove blocking rpc and use a separate tokio runtime for jsonrpc server #10913

Merged
merged 10 commits into from
Apr 22, 2023

Conversation

patrickkuo
Copy link
Contributor

@patrickkuo patrickkuo commented Apr 14, 2023

  • reverted marking RPC methods as "blocking"
  • using separate tokio runtime for the RPC server, worker size is defaulted to 1/2 of the CPU core, can be changed using RPC_WORKER_THREAD env variable.
  • use tokio::spawn for some of the long running rocksdb calls to unblock the request handling task.

@vercel
Copy link

vercel bot commented Apr 14, 2023

The latest updates on your projects. Learn more about Vercel for Git ↗︎

4 Ignored Deployments
Name Status Preview Comments Updated (UTC)
explorer ⬜️ Ignored (Inspect) Apr 22, 2023 1:44am
explorer-storybook ⬜️ Ignored (Inspect) Apr 22, 2023 1:44am
sui-wallet-kit ⬜️ Ignored (Inspect) Apr 22, 2023 1:44am
wallet-adapter ⬜️ Ignored (Inspect) Apr 22, 2023 1:44am

@patrickkuo patrickkuo force-pushed the pat/remove_rpc_blocking branch 6 times, most recently from 71c44dc to 8beb078 Compare April 18, 2023 00:28
@patrickkuo patrickkuo changed the base branch from main to releases/sui-v0.32.0-release April 18, 2023 00:47
@patrickkuo patrickkuo force-pushed the pat/remove_rpc_blocking branch from 7d74669 to b159836 Compare April 18, 2023 17:20
.collect::<Vec<_>>();
let state = self.state.clone();

let mut data = spawn_monitored_task!(async move {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the reason to do spawn_monitored_task(...).await here and in similar places? The only effect of this is that the read request will continue to execute if RPC is cancelled, but I can't think why would we want this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without this, the rocksdb call will block the jsonrpsee handle_request, this causes the RPC server to not respond to any new request until it complete.

@patrickkuo patrickkuo force-pushed the pat/remove_rpc_blocking branch 6 times, most recently from 52823c2 to a4e7654 Compare April 20, 2023 02:58
@github-actions github-actions bot added the Type: Documentation Improvements or additions to documentation label Apr 20, 2023
@patrickkuo patrickkuo changed the base branch from releases/sui-v0.32.0-release to main April 20, 2023 03:14
@patrickkuo patrickkuo force-pushed the pat/remove_rpc_blocking branch from a4e7654 to 15b5eef Compare April 20, 2023 11:28
@github-actions github-actions bot removed the Type: Documentation Improvements or additions to documentation label Apr 20, 2023
@patrickkuo patrickkuo force-pushed the pat/remove_rpc_blocking branch 3 times, most recently from d67b8c3 to 4fe597f Compare April 21, 2023 12:12
@patrickkuo patrickkuo marked this pull request as ready for review April 21, 2023 15:16
@patrickkuo patrickkuo changed the title remove blocking rpc remove blocking rpc and use a separate Tokyo runtime for jsonrpc server Apr 21, 2023
@patrickkuo patrickkuo requested review from 666lcz, oxade and andll April 21, 2023 15:17
@patrickkuo patrickkuo changed the title remove blocking rpc and use a separate Tokyo runtime for jsonrpc server remove blocking rpc and use a separate tokio runtime for jsonrpc server Apr 21, 2023
@patrickkuo patrickkuo force-pushed the pat/remove_rpc_blocking branch from b5885ab to 1e97d38 Compare April 21, 2023 20:10
patrickkuo added a commit that referenced this pull request Apr 21, 2023
Copy link
Contributor

@longbowlu longbowlu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@patrickkuo have we tried to move all the spawn to the jsonrpc layer? I'm a bit concerned the changes in authority.rs could impact validator performance. It could be in a bad or even good way, but hard to tell without experimenting first.

if obj_ref.2.is_alive() {
match database.get_object_by_key(&object_id, obj_ref.1)? {
None => {
error!("Object with in parent_entry is missing from object store, datastore is inconsistent");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we log the ObjectID here

@patrickkuo
Copy link
Contributor Author

@patrickkuo have we tried to move all the spawn to the jsonrpc layer? I'm a bit concerned the changes in authority.rs could impact validator performance. It could be in a bad or even good way, but hard to tell without experimenting first.

sure, I will limit the change to RPC layer

@patrickkuo patrickkuo marked this pull request as draft April 21, 2023 23:53
@patrickkuo patrickkuo marked this pull request as ready for review April 22, 2023 02:40
@patrickkuo patrickkuo requested a review from longbowlu April 22, 2023 17:45
@patrickkuo patrickkuo merged commit 4b2b5c7 into main Apr 22, 2023
@patrickkuo patrickkuo deleted the pat/remove_rpc_blocking branch April 22, 2023 19:05
sadhansood pushed a commit that referenced this pull request Apr 22, 2023
…er (#10913)

* reverted marking RPC methods as "blocking"
* using separate tokio runtime for the RPC server, worker size is
defaulted to 1/2 of the CPU core, can be changed using
`RPC_WORKER_THREAD` env variable.
* use tokio::spawn for some of the long running rocksdb calls to unblock
the request handling task.
666lcz added a commit that referenced this pull request Apr 27, 2023
## Description 

Load testing shows that there's some perf regression in
`multi_get_transaction_blocks` after
#10913 because some of the
blocking queries such as computing object/balance changes are not
wrapped in spawn_async. This also impacts the perf for
`query_transaction_blocks` which also uses
`multi_get_transaction_blocks` under the hood.

## Next steps
Audit rest of the codebase and make sure all blocking calls are wrapped
in spawn_async

## Test Plan 

Tested against loadgen client and observed that performance regression
is gone

---
If your changes are not user-facing and not a breaking change, you can
skip the following section. Otherwise, please indicate what changed, and
then add to the Release Notes section as highlighted during the release
process.

### Type of Change (Check all that apply)

- [ ] user-visible impact
- [ ] breaking change for a client SDKs
- [ ] breaking change for FNs (FN binary must upgrade)
- [ ] breaking change for validators or node operators (must upgrade
binaries)
- [ ] breaking change for on-chain data layout
- [ ] necessitate either a data wipe or data migration

### Release notes
666lcz added a commit that referenced this pull request Apr 27, 2023
## Description 

Load testing shows that there's some perf regression in
`multi_get_transaction_blocks` after
#10913 because some of the
blocking queries such as computing object/balance changes are not
wrapped in spawn_async. This also impacts the perf for
`query_transaction_blocks` which also uses
`multi_get_transaction_blocks` under the hood.

## Next steps
Audit rest of the codebase and make sure all blocking calls are wrapped
in spawn_async

## Test Plan 

Tested against loadgen client and observed that performance regression
is gone

---
If your changes are not user-facing and not a breaking change, you can
skip the following section. Otherwise, please indicate what changed, and
then add to the Release Notes section as highlighted during the release
process.

### Type of Change (Check all that apply)

- [ ] user-visible impact
- [ ] breaking change for a client SDKs
- [ ] breaking change for FNs (FN binary must upgrade)
- [ ] breaking change for validators or node operators (must upgrade
binaries)
- [ ] breaking change for on-chain data layout
- [ ] necessitate either a data wipe or data migration

### Release notes
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants