Fix repeated block timeouts during initial sync #5709

Open

Description

Motivation

During the initial sync, Zebra's block validation repeatedly times out, and the syncer resets:

2022-11-23T09:22:15.197732Z WARN {net="Main"}:sync:try_to_sync: zebrad::components::sync: error downloading and verifying block e=ValidationRequestError { error: Elapsed(()), height: Height(1856228), hash: block::Hash("0000000000d433e30155445ddc72854eb7395827c1c241b249a255585ea2716a") }

https://github.com/ZcashFoundation/zebra/actions/runs/3519147385/jobs/5923785413#step:8:12498

This only happens in some parts of the chain, and at slightly different block heights each time. When it does happen, it can recur every 10-15 minutes:
https://github.com/ZcashFoundation/zebra/actions/runs/3519147385/jobs/5923785413#step:8:12465

This is not a priority because Zebra is still much faster than zcashd.

Designs

This could be happening because:

  • blocks are committed out of order
  • we're downloading blocks in the wrong order (not strict height order), then hitting a concurrency limit, so we can't download the remaining blocks before the timeout
  • the block timeout is too short for the number of blocks allowed in the validate and commit pipeline
    • decreasing the pipeline capacity also reduces RAM use, but makes runs of small blocks download and verify slower
    • increasing the timeout increases the amount of time we wait for blocks that will never verify, increasing RAM usage
    • we tried increasing the timeout and it didn't work, but a slight increase might be ok
  • our state isn't responding to all the UTXO requests for the blocks it already has (I think this is a known issue)
  • the verifier is blocked in some other way
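
To make the timeout versus pipeline capacity trade-off concrete, here is a minimal back-of-the-envelope sketch. It assumes blocks are committed strictly one at a time and that the timeout starts when a block enters the pipeline. `PipelineModel`, its fields, and the numbers in `main` are hypothetical illustrations, not Zebra's actual types or configuration.

```rust
use std::time::Duration;

/// Hypothetical model of the validate-and-commit pipeline.
/// None of these names or values are taken from Zebra.
struct PipelineModel {
    /// Maximum number of blocks allowed in the pipeline at once.
    capacity: u32,
    /// Assumed average time to commit one block, with sequential commits.
    commit_time: Duration,
    /// Per-block timeout, measured from when the block enters the pipeline.
    block_timeout: Duration,
}

impl PipelineModel {
    /// Worst-case wait for the last block in a full pipeline:
    /// every block ahead of it, and the block itself, must commit first.
    fn worst_case_wait(&self) -> Duration {
        self.commit_time * self.capacity
    }

    /// True if the last block in a full pipeline would exceed the timeout.
    fn last_block_times_out(&self) -> bool {
        self.worst_case_wait() > self.block_timeout
    }

    /// Largest capacity for which a full pipeline still drains within the timeout.
    fn max_safe_capacity(&self) -> u32 {
        // Guard against division by zero for sub-millisecond commit times.
        let per_block_ms = self.commit_time.as_millis().max(1);
        (self.block_timeout.as_millis() / per_block_ms) as u32
    }
}

fn main() {
    // Illustrative numbers only.
    let model = PipelineModel {
        capacity: 1000,
        commit_time: Duration::from_millis(500),
        block_timeout: Duration::from_secs(360),
    };

    println!("worst-case wait: {:?}", model.worst_case_wait());
    println!("last block times out: {}", model.last_block_times_out());
    println!("max safe capacity: {}", model.max_safe_capacity());
}
```

Under this model, slow commits in a heavy part of the chain push the worst-case wait past the timeout even when every block is valid, which would match timeouts that only show up at some heights. Reducing capacity or speeding up commits both bring the wait back under the limit.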

Related Work

If we speed up block commits, it might solve this issue:

Metadata

    Labels

    • A-network (Area: Network protocol updates or fixes)
    • A-state (Area: State / database changes)
    • C-bug (Category: This is a bug)
    • I-slow (Problems with performance or responsiveness)
    • I-usability (Zebra is hard to understand or use)
    • S-needs-triage (Status: A bug report needs triage)
