Mini-Epic: Stop tokio tasks running for a long time and blocking other tasks #4747
Closed
Description
Motivation
At the moment, Zebra can't sync all the way to the tip, because some tokio tasks run for a long time, and block other tasks.
(It's also possible there are some deadlocks, livelocks, or missed task exits.)
We should discover the specific bugs using tokio-console, and then open a ticket for each one.
Tasks
Issues that need investigation
- Investigate busiest tasks per tokio-console #4583
- Find out which parts of CommitBlock/CommitFinalizedBlock are slow #4823 and then speed them up
- Add a Future wrapper that times each poll, and logs long polls
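The poll-timing wrapper in the last item could look roughly like the sketch below. It is an illustration only, not Zebra's implementation: the names `TimedPoll`, `long_polls`, and `poll_once` are hypothetical, and it logs with `eprintln!` to stay standard-library only, where a real version would more likely use a logging or tracing macro.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::time::{Duration, Instant};

/// Hypothetical wrapper that times every `poll` of the inner future
/// and logs polls that take at least `threshold`.
pub struct TimedPoll<F> {
    inner: Pin<Box<F>>,
    label: &'static str,
    threshold: Duration,
    long_polls: usize,
}

impl<F: Future> TimedPoll<F> {
    pub fn new(label: &'static str, threshold: Duration, inner: F) -> Self {
        Self { inner: Box::pin(inner), label, threshold, long_polls: 0 }
    }

    /// Number of polls so far that met or exceeded the threshold.
    pub fn long_polls(&self) -> usize {
        self.long_polls
    }
}

impl<F: Future> Future for TimedPoll<F> {
    type Output = F::Output;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        // `TimedPoll` is `Unpin` because the inner future is boxed.
        let this = self.get_mut();

        let start = Instant::now();
        let result = this.inner.as_mut().poll(cx);
        let elapsed = start.elapsed();

        if elapsed >= this.threshold {
            this.long_polls += 1;
            eprintln!(
                "long poll in {}: {:?} (threshold {:?})",
                this.label, elapsed, this.threshold
            );
        }
        result
    }
}

/// Waker that does nothing, used only to poll by hand in the example.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls a future once without an async runtime (demo helper).
pub fn poll_once<F: Future + Unpin>(fut: &mut F) -> Poll<F::Output> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    Pin::new(fut).poll(&mut cx)
}
```

In an async runtime the wrapper would be applied where a task is spawned, so that every poll of that task is measured. The threshold is a tuning choice: a long poll is one that keeps the executor thread busy long enough to delay other tasks.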
CPU usage analysis
- 1. fix(state): Stop reading redundant blocks for every FindHashes and FindHeaders request #4825 & 2. fix(state): Make FindHeaders and FindHashes run concurrently with state updates #4826
Deserialization (in zebra-network or zebra-state):
- sapling::output::OutputPrefixInTransactionV5::zcash_deserialize() (Move network transaction deserialization to a dedicated blocking and CPU-heavy thread #4787)
sapling::commitment::ValueCommitment::try_from::<[u8; 32]>()
bellman::groth16::Proof::read()
jubjub::AffinePoint::from_bytes_inner()
bls12_381::scalar::square()
bls12_381::scalar::sqrt()
- finalized_state::ZebraDb::block() (Move database block and transaction fetches to a dedicated blocking and CPU-heavy thread #4788)
Verification (in zebra-consensus):
- groth16::DescriptionWrapper::try_from() (Move CPU-heavy proof preparation into the batch cryptography thread #4789)
transaction::Verifier::verify_v5_transaction()
Note commitment tree updates (in zebra-state, either finalized or non-finalized):
- Note commitment tree append and root (Move note commitment tree updates to a dedicated blocking and CPU-heavy thread #4790)
non_finalized_state::chain::UpdateWith
sapling::tree::merkle_crh_sapling()
sapling::commitment::pedersen_hashes::pedersen_hash()
sapling::commitment::pedersen_hashes::pedersen_hash_to_point()
incrementalmerkletree::bridgetree::Frontier::append()
- feat(state): Send treestate from non-finalized state to finalized state #4721 should also help with non-finalized blocks
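Several of the tickets above propose moving CPU-heavy work onto a dedicated blocking thread so it cannot stall the async executor. The sketch below shows the general pattern with the standard library only. `CpuWorker` and `expensive_checksum` are hypothetical stand-ins, not Zebra code, and in a tokio application the same effect is more commonly achieved with `tokio::task::spawn_blocking` or a separate thread pool.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical stand-in for CPU-heavy work such as deserialization
/// or note commitment tree updates.
pub fn expensive_checksum(bytes: &[u8]) -> u64 {
    bytes
        .iter()
        .fold(0u64, |acc, b| acc.wrapping_mul(31).wrapping_add(u64::from(*b)))
}

/// A job is the input plus a channel on which to send the result back.
type Job = (Vec<u8>, mpsc::Sender<u64>);

/// Handle to a dedicated worker thread that runs jobs one at a time.
pub struct CpuWorker {
    jobs: mpsc::Sender<Job>,
}

impl CpuWorker {
    /// Starts the worker thread. It exits when the `CpuWorker` is dropped,
    /// because dropping the sender closes the job channel.
    pub fn spawn() -> Self {
        let (jobs, queue) = mpsc::channel::<Job>();
        thread::spawn(move || {
            for (bytes, reply) in queue {
                // The caller may have given up waiting; ignore send errors.
                let _ = reply.send(expensive_checksum(&bytes));
            }
        });
        Self { jobs }
    }

    /// Queues a job and returns immediately with a receiver for the result.
    pub fn submit(&self, bytes: Vec<u8>) -> mpsc::Receiver<u64> {
        let (reply, result) = mpsc::channel();
        self.jobs
            .send((bytes, reply))
            .expect("worker thread has exited");
        result
    }
}
```

The caller pays only for queueing the job, and the heavy computation runs off the calling thread. In async code the result should be awaited through an async-aware channel or join handle, since a blocking `recv()` inside a task would reintroduce the original problem.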
Fixed Issues
Fixed by #4750:
- Check for Worker panics in tower_batch::Batch #4738
- Run groth16, ed25519, and redjubjub batches on a blocking thread #4740
- Stop batch verification lagging and failing entire batches #4729
- Replace batch verifier broadcast channels with watch channels (cleanup only)
- Zebra is slow near the tip #4650
- Lagged inventory advertisements (might also need fix(batch): Improve batch verifier async, correctness, and performance #4750)
- Verification failure after block 1719629 on mainnet (also block 1944652 on testnet is very slow)