-
Notifications
You must be signed in to change notification settings - Fork 11.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
remove blocking rpc and use a separate tokio runtime for jsonrpc server #10913
Conversation
The latest updates on your projects. Learn more about Vercel for Git ↗︎ |
71c44dc
to
8beb078
Compare
7d74669
to
b159836
Compare
.collect::<Vec<_>>(); | ||
let state = self.state.clone(); | ||
|
||
let mut data = spawn_monitored_task!(async move { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the reason to do spawn_monitored_task(...).await
here and in similar places? The only effect of this is that the read request will continue to execute if RPC is cancelled, but I can't think why would we want this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Without this, the rocksdb call will block the jsonrpsee handle_request, this causes the RPC server to not respond to any new request until it complete.
52823c2
to
a4e7654
Compare
a4e7654
to
15b5eef
Compare
d67b8c3
to
4fe597f
Compare
b5885ab
to
1e97d38
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@patrickkuo have we tried to move all the spawn
to the jsonrpc layer? I'm a bit concerned the changes in authority.rs could impact validator performance. It could be in a bad or even good way, but hard to tell without experimenting first.
crates/sui-core/src/authority.rs
Outdated
if obj_ref.2.is_alive() { | ||
match database.get_object_by_key(&object_id, obj_ref.1)? { | ||
None => { | ||
error!("Object with in parent_entry is missing from object store, datastore is inconsistent"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we log the ObjectID here
sure, I will limit the change to RPC layer |
…er (#10913) * reverted marking RPC methods as "blocking" * using separate tokio runtime for the RPC server, worker size is defaulted to 1/2 of the CPU core, can be changed using `RPC_WORKER_THREAD` env variable. * use tokio::spawn for some of the long running rocksdb calls to unblock the request handling task.
## Description Load testing shows that there's some perf regression in `multi_get_transaction_blocks` after #10913 because some of the blocking queries such as computing object/balance changes are not wrapped in spawn_async. This also impacts the perf for `query_transaction_blocks` which also uses `multi_get_transaction_blocks` under the hood. ## Next steps Audit rest of the codebase and make sure all blocking calls are wrapped in spawn_async ## Test Plan Tested against loadgen client and observed that performance regression is gone --- If your changes are not user-facing and not a breaking change, you can skip the following section. Otherwise, please indicate what changed, and then add to the Release Notes section as highlighted during the release process. ### Type of Change (Check all that apply) - [ ] user-visible impact - [ ] breaking change for a client SDKs - [ ] breaking change for FNs (FN binary must upgrade) - [ ] breaking change for validators or node operators (must upgrade binaries) - [ ] breaking change for on-chain data layout - [ ] necessitate either a data wipe or data migration ### Release notes
## Description Load testing shows that there's some perf regression in `multi_get_transaction_blocks` after #10913 because some of the blocking queries such as computing object/balance changes are not wrapped in spawn_async. This also impacts the perf for `query_transaction_blocks` which also uses `multi_get_transaction_blocks` under the hood. ## Next steps Audit rest of the codebase and make sure all blocking calls are wrapped in spawn_async ## Test Plan Tested against loadgen client and observed that performance regression is gone --- If your changes are not user-facing and not a breaking change, you can skip the following section. Otherwise, please indicate what changed, and then add to the Release Notes section as highlighted during the release process. ### Type of Change (Check all that apply) - [ ] user-visible impact - [ ] breaking change for a client SDKs - [ ] breaking change for FNs (FN binary must upgrade) - [ ] breaking change for validators or node operators (must upgrade binaries) - [ ] breaking change for on-chain data layout - [ ] necessitate either a data wipe or data migration ### Release notes
RPC_WORKER_THREAD
env variable.