Tracking: Commit blocks to state using a separate task #4937

Closed
@teor2345

Description

Motivation

Zebra takes 10-15 minutes to commit some blocks to the state while checkpointing, around blocks 1,718,000 to 1,772,000.

The slow blocks are different on different runs.

This is unacceptable performance, because:

  • it's much slower than zcashd
  • Zebra will appear to hang for 15 minutes, which is a usability and security issue
  • it causes warnings in the Zebra logs
  • if it's remotely triggerable, it could be a denial of service risk

Diagnosis

Zebra queues up to 1200 blocks, then commits them all in the same state request, after the missing block arrives. This can take up to 10 seconds per block.
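
The queueing behaviour described above can be sketched as follows. This is a minimal illustration, not Zebra's actual code: `QueuedBlocks`, `queue`, and `drain_contiguous` are hypothetical names, and blocks are represented by plain strings. It shows why one late block can release a long run of queued blocks into a single commit.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a block height.
type Height = u32;

/// Blocks waiting for a missing parent, keyed by height.
/// (Illustrative type, not part of Zebra.)
#[derive(Default)]
struct QueuedBlocks {
    by_height: BTreeMap<Height, String>,
}

impl QueuedBlocks {
    /// Stores a block until all of its ancestors have arrived.
    fn queue(&mut self, height: Height, block: String) {
        self.by_height.insert(height, block);
    }

    /// Removes and returns every queued block in the unbroken run that
    /// starts at `next_height`. Blocks after the first gap stay queued.
    fn drain_contiguous(&mut self, mut next_height: Height) -> Vec<(Height, String)> {
        let mut ready = Vec::new();
        while let Some(block) = self.by_height.remove(&next_height) {
            ready.push((next_height, block));
            next_height += 1;
        }
        ready
    }
}
```

If the caller commits everything returned by `drain_contiguous` inside one request, that request cannot complete until the whole run has been written, which matches the stall described in this section.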

Design

Add a block commit task to the state, which runs in a separate thread. The task should be between the block queue and the block verifier.

We'll need to move the shared mutable chain state into the block commit task, so we will also need to redirect StateService read requests to the concurrent ReadStateService.
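
A rough sketch of the design above, under stated assumptions: it uses `std::thread` and `std::sync::mpsc` so that it is self-contained, whereas the real implementation may use a different runtime and channel types. `CommitRequest` and `spawn_commit_task` are hypothetical names, and the "chain state" is reduced to a tip height.

```rust
use std::sync::mpsc::{self, Sender};
use std::thread::{self, JoinHandle};

/// Hypothetical message: a block height plus a channel for the result.
struct CommitRequest {
    height: u32,
    respond_to: Sender<Result<u32, String>>,
}

/// Spawns a thread that owns the mutable chain state (here just the tip
/// height) and commits blocks in the order they arrive. Because only this
/// thread can mutate the state, callers never block on a shared lock.
fn spawn_commit_task(mut tip: u32) -> (Sender<CommitRequest>, JoinHandle<u32>) {
    let (tx, rx) = mpsc::channel::<CommitRequest>();
    let handle = thread::spawn(move || {
        // The loop ends once every sender has been dropped.
        for req in rx {
            let result = if req.height == tip + 1 {
                tip = req.height;
                Ok(tip)
            } else {
                Err(format!("expected height {}, got {}", tip + 1, req.height))
            };
            // The caller may have stopped waiting, so ignore send errors.
            let _ = req.respond_to.send(result);
        }
        // Return the final tip so the spawner can inspect it on join.
        tip
    });
    (tx, handle)
}
```

Moving ownership of the state into the task is what forces reads to go elsewhere: in this sketch nothing outside the thread can see `tip` until the thread exits, which is why read requests would need a separate concurrent read path.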

Here is a diagram of the new state design:
https://docs.google.com/drawings/d/1FXpAUlenDAjl8nkftrypdAPsj0jr-Ut9gZlSP57nuyc/edit

Implementation Plan

Stop Accessing Mutable Chain State

Set Up Channels

Set Up Block Commit Task

  • Add a new block commit task with unused channels

Add channels to send blocks to the task

  • Add a channel that handles finalized state CommitFinalizedBlock requests
  • Add a channel that handles non-finalized state CommitBlock requests
  • We want two channels so we can wait for the last finalized block before committing the first non-finalized block (by height)
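
One way the two-channel ordering could work is sketched below. The assumptions are loud ones: `run_commit_loop` and `last_finalized_height` are hypothetical, blocks are represented by their heights only, and `std::sync::mpsc` stands in for whatever channel type the real task uses.

```rust
use std::sync::mpsc;

/// Drains the finalized channel until the last finalized height has been
/// committed, and only then starts reading the non-finalized channel.
/// Returns a log of what was committed, in order.
fn run_commit_loop(
    finalized: mpsc::Receiver<u32>,
    non_finalized: mpsc::Receiver<u32>,
    last_finalized_height: u32,
) -> Vec<(&'static str, u32)> {
    let mut committed = Vec::new();
    let mut tip: Option<u32> = None;

    // Phase 1: finalized blocks only. Non-finalized blocks that arrive
    // early simply wait in their own channel.
    while tip != Some(last_finalized_height) {
        match finalized.recv() {
            Ok(height) => {
                committed.push(("finalized", height));
                tip = Some(height);
            }
            // The finalized sender was dropped before the last finalized
            // block arrived, so stop without touching the other channel.
            Err(_) => return committed,
        }
    }

    // Phase 2: non-finalized blocks, until their sender is dropped.
    for height in non_finalized {
        committed.push(("non-finalized", height));
    }
    committed
}
```

With a single shared channel, a non-finalized block sent early would sit in front of the remaining finalized blocks. Separate channels let the task choose which source to read from at each phase.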

Error Handling & Testing

Optional tasks:

Optional Cleanup Tasks

Bug fixes:

Refactors:

  • Make pending_utxos.respond() async using a channel, so we can use ReadRequest::ChainUtxo in AwaitUtxo

Renames & Formatting:

  • Rename every instance of address * or transparent_* to address_*
  • Put the Request and Response enums in a consistent order

In Scope

  • Non-finalized state
  • Finalized state
  • Running the task in a separate thread

Out of Scope

We don't think we'll need to make these changes as part of this change:

These are definitely out of scope:

  • Other state refactors
  • Other performance improvements
  • Note commitment tree performance improvements

Metadata

Labels

  • A-state: Area: State / database changes
  • C-bug: Category: This is a bug
  • C-security: Category: Security issues
  • I-hang: A Zebra component stops responding to requests
  • I-slow: Problems with performance or responsiveness
  • I-usability: Zebra is hard to understand or use
  • S-needs-investigation: Status: Needs further investigation
