## Description
Currently, certificates fetched during catchup are processed and
accepted one by one, the same way as in normal certificate handling. This
turns out to be a throughput bottleneck for catching up. This PR includes
one major change and a few minor changes to improve catchup throughput.

Main change: take the lock on `state` once per batch of certificates in
`process_certificate_with_lock()`, instead of once per certificate.
`tokio::sync::Mutex` seems to be really slow as the lock on `state`:
2000 lock operations took 5 to 10 s.
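To illustrate the idea of locking once per batch rather than once per certificate, here is a minimal sketch. It is not the code in this PR: `State`, `Certificate`, `accept_one_by_one` and `accept_batch` are hypothetical names, and `std::sync::Mutex` stands in for `tokio::sync::Mutex` so the snippet compiles without extra crates.

```rust
use std::sync::Mutex;

/// Hypothetical stand-in for the shared `state` (assumption, not the real type).
#[derive(Default)]
pub struct State {
    pub accepted: Vec<u64>,
}

/// Hypothetical stand-in for a certificate; only carries a round number.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Certificate {
    pub round: u64,
}

/// Old shape: acquire the lock once per certificate.
/// Returns the number of lock acquisitions performed.
pub fn accept_one_by_one(state: &Mutex<State>, certs: &[Certificate]) -> usize {
    let mut acquisitions = 0;
    for cert in certs {
        // The guard is dropped at the end of each iteration.
        let mut guard = state.lock().unwrap();
        acquisitions += 1;
        guard.accepted.push(cert.round);
    }
    acquisitions
}

/// New shape: acquire the lock once and process the whole batch under it.
/// Returns the number of lock acquisitions performed (always 1).
pub fn accept_batch(state: &Mutex<State>, certs: &[Certificate]) -> usize {
    let mut guard = state.lock().unwrap();
    for cert in certs {
        guard.accepted.push(cert.round);
    }
    1
}
```

Both functions leave `state` with the same contents; for a batch of 2000 certificates the first acquires the lock 2000 times and the second once. The trade-off is that the batch variant holds the lock for longer in one stretch, so other users of `state` wait for the entire batch.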
Other changes:
- Increase channel size in Narwhal across the board from 1k to 10k, to
avoid filling the channels too often. I did not see a noticeable
increase in memory usage.
- Avoid using a blocking thread for verifying user signatures in
transactions, since this verification should be relatively fast.
- Cleanups.
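As a rough illustration of why the channel size matters, the sketch below counts how many messages a bounded channel accepts before reporting that it is full when nothing drains it. It uses `std::sync::mpsc::sync_channel` as a stand-in; the actual channel type used in Narwhal is not shown here, and `sends_before_full` is a hypothetical helper.

```rust
use std::sync::mpsc::{sync_channel, TrySendError};

/// Counts how many non-blocking sends succeed on a bounded channel of the
/// given capacity while the receiver is alive but never reads.
/// Assumes `capacity > 0`.
pub fn sends_before_full(capacity: usize) -> usize {
    // `_rx` is kept in scope so the channel stays connected.
    let (tx, _rx) = sync_channel::<u64>(capacity);
    let mut sent = 0;
    loop {
        match tx.try_send(sent as u64) {
            Ok(()) => sent += 1,
            // The buffer is full: a blocking sender would stall at this point.
            Err(TrySendError::Full(_)) => return sent,
            Err(TrySendError::Disconnected(_)) => {
                unreachable!("receiver is still in scope")
            }
        }
    }
}
```

With a capacity of 1,000 the producer hits backpressure after 1,000 undelivered messages; with 10,000 it can run ten times further ahead of a slow consumer, at the cost of buffering more messages in memory.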
## Test Plan
### Private testnet
In catchup experiments with 5000 TPS and 150 validators, this seems to
improve catchup speed from ~2/round to ~8/round.

Before: catching up after 1 hr of downtime never finished within the epoch.

After: catching up after 1 hr of downtime took ~20 min.

---
If your changes are not user-facing and not a breaking change, you can
skip the following section. Otherwise, please indicate what changed, and
then add to the Release Notes section as highlighted during the release
process.
### Type of Change (Check all that apply)
- [ ] protocol change
- [ ] user-visible impact
- [ ] breaking change for client SDKs
- [ ] breaking change for FNs (FN binary must upgrade)
- [ ] breaking change for validators or node operators (must upgrade
binaries)
- [ ] breaking change for on-chain data layout
- [ ] necessitate either a data wipe or data migration
### Release notes