Stop batch verification lagging and failing entire batches #4729

@teor2345

Description

Motivation

The batch verifier can lag and fail entire blocks, even if the proofs and signatures are valid. We should re-design it so it can't lag.

If we can't do that, the dropped verifications should fail immediately, rather than waiting for the block verification timeout:

2022-06-30T17:51:02.943983Z ERROR{net="Main"}:sync:try_to_sync:extend_tips:zebra_consensus::primitives::redpallas: batch verification receiver lagged and lost verification results
2022-06-30T17:51:57.423166Z INFO{net="Main"}:zebrad::components::sync::progress: estimated progress to chain tip sync_percent=99.819% current_height=Height(1718380) network_upgrade=Nu5 remaining_sync_blocks=3119 time_since_last_state_block=1m
...
2022-06-30T17:54:57.425405Z INFO{net="Main"}:zebrad::components::sync::progress: estimated progress to chain tip sync_percent=99.819% current_height=Height(1718380) network_upgrade=Nu5 remaining_sync_blocks=3122 time_since_last_state_block=4m
2022-06-30T17:55:57.426619Z INFO{net="Main"}:zebrad::components::sync::progress: estimated progress to chain tip sync_percent=99.819% current_height=Height(1718381) network_upgrade=Nu5 remaining_sync_blocks=3121 time_since_last_state_block=0s
2022-06-30T17:56:47.835664Z WARN{net="Main"}:sync:try_to_sync:zebrad::components::sync: error downloading and verifying block e=Invalid { error: Block(Transaction(InternalDowncastError("downcast to known transaction error type failed, original error: Elapsed(())"))), height: Height(1718445), hash: block::Hash("0000000000f0f61e6f42984784ad367711c0b3e704e840797606314426dc2a90") }

https://github.com/ZcashFoundation/zebra/runs/7135923149?check_suite_focus=true#step:6:644

Designs

We can either:

  • replace the batch verifier's result channel with a watch channel, and create a new watch channel for each batch
  • create a new broadcast channel of size 1 for each batch, so it only ever holds one result
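
Either design comes down to the same idea: each batch gets its own single-slot result that is kept rather than consumed, so a slow receiver cannot lag behind and lose it. Here is a minimal sketch of that idea, using only `std` primitives as a stand-in for a per-batch `tokio::sync::watch` channel. `BatchResult` and `run_batches` are hypothetical names for illustration, not Zebra's actual types, and a real verifier would be async rather than thread-based.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

/// Hypothetical single-slot result cell for ONE batch.
/// Assumption: this stands in for a per-batch watch channel.
#[derive(Clone)]
pub struct BatchResult {
    slot: Arc<(Mutex<Option<bool>>, Condvar)>,
}

impl BatchResult {
    /// Create an empty cell; one is created per batch.
    pub fn new() -> Self {
        Self {
            slot: Arc::new((Mutex::new(None), Condvar::new())),
        }
    }

    /// Publish the batch outcome and wake every waiter.
    pub fn publish(&self, valid: bool) {
        let (lock, cvar) = &*self.slot;
        *lock.lock().unwrap() = Some(valid);
        cvar.notify_all();
    }

    /// Block until the outcome is available.
    /// The value stays in the slot, so a late waiter still sees it.
    pub fn wait(&self) -> bool {
        let (lock, cvar) = &*self.slot;
        let mut guard = lock.lock().unwrap();
        while guard.is_none() {
            guard = cvar.wait(guard).unwrap();
        }
        (*guard).expect("loop exits only once the slot is set")
    }
}

/// Run one batch per entry in `outcomes`, with `waiters` threads waiting on
/// that batch's own cell. Returns what each waiter observed, per batch.
pub fn run_batches(outcomes: &[bool], waiters: usize) -> Vec<Vec<bool>> {
    outcomes
        .iter()
        .map(|&valid| {
            // A fresh cell per batch: results from other batches can never
            // queue up behind this one.
            let cell = BatchResult::new();
            let handles: Vec<_> = (0..waiters)
                .map(|_| {
                    let cell = cell.clone();
                    thread::spawn(move || cell.wait())
                })
                .collect();
            cell.publish(valid);
            handles.into_iter().map(|h| h.join().unwrap()).collect()
        })
        .collect()
}
```

Because the slot only ever holds the one result for its batch, there is no queue for a receiver to fall behind on. A waiter that arrives after `publish` returns immediately with the stored value instead of hitting a lag error and waiting for the block verification timeout.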

Labels

  • C-bug (Category: This is a bug)
  • I-slow (Problems with performance or responsiveness)
  • S-needs-investigation (Status: Needs further investigation)