Improve block processing performance during re-org #2805

Open
michaelsproul opened this issue Nov 12, 2021 · 0 comments
Labels
A1 major-task A significant amount of work or conceptual task. optimization Something to make Lighthouse run more efficiently.

Comments

@michaelsproul
Member

Description

Consider the following re-org that frustrates Lighthouse's attempts to process blocks quickly:

Let n be a slot on an epoch boundary (n % 32 == 0).

  1. Immediately prior to slot n the preemptive state advance occurs as normal
  2. The block from slot n arrives super late (12s+), consuming the advanced state
  3. The block from slot n + 1 arrives on time, but builds upon the parent at slot n - 1. It's going to be super slow to process because its parent state is missing from the cache, meaning:
    a) We need to load the full state for slot n - 1 from disk (a few hundred ms)
    b) We need to transition that state through an epoch boundary (200ms)
    c) We need to store the state for slot n on disk. It differs from the slot n state with block n applied, and presently we store every epoch boundary state
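
The sequence above can be sketched as a toy model. This is not Lighthouse's actual cache: the `AdvancedStateCache` type, the `StateSource` enum and the integer block roots are all hypothetical, and the only thing modelled is whether a block finds a pre-advanced parent state or has to fall back to disk and cross an epoch boundary.

```rust
use std::collections::HashMap;

const SLOTS_PER_EPOCH: u64 = 32;

/// Hypothetical stand-in for a block root.
type Root = u64;

/// Where the pre-state for a block came from in this toy model.
#[derive(Debug, PartialEq, Eq)]
enum StateSource {
    /// A pre-advanced state for the parent was found in the cache.
    CacheHit,
    /// The parent state must be loaded from disk and advanced by hand.
    DiskLoad { crosses_epoch_boundary: bool },
}

/// Toy cache: for each parent block root, the slot its state was pre-advanced to.
/// An entry is consumed (removed) by the first block that uses it.
#[derive(Default)]
struct AdvancedStateCache {
    advanced: HashMap<Root, u64>,
}

impl AdvancedStateCache {
    /// Record that the state of `parent_root` was pre-advanced to `to_slot`.
    fn advance(&mut self, parent_root: Root, to_slot: u64) {
        self.advanced.insert(parent_root, to_slot);
    }

    /// Fetch the pre-state for a block at `block_slot` built on `parent_root`.
    fn take_for_block(&mut self, parent_root: Root, parent_slot: u64, block_slot: u64) -> StateSource {
        match self.advanced.remove(&parent_root) {
            Some(slot) if slot <= block_slot => StateSource::CacheHit,
            _ => StateSource::DiskLoad {
                crosses_epoch_boundary: parent_slot / SLOTS_PER_EPOCH != block_slot / SLOTS_PER_EPOCH,
            },
        }
    }
}

/// Replays steps 1-3 for a given slot `n` and reports how each block got its state.
fn reorg_scenario(n: u64) -> (StateSource, StateSource) {
    let root_prev: Root = 1; // hypothetical root of the block at slot n - 1
    let mut cache = AdvancedStateCache::default();

    // 1. Just before slot n, the head state (block n - 1) is advanced to slot n.
    cache.advance(root_prev, n);
    // 2. The late block at slot n consumes the advanced state.
    let late_block = cache.take_for_block(root_prev, n - 1, n);
    // 3. The block at slot n + 1 also builds on n - 1, but the entry is gone.
    let reorg_block = cache.take_for_block(root_prev, n - 1, n + 1);

    (late_block, reorg_block)
}
```

In this model, calling `reorg_scenario` with an epoch-boundary slot yields a cache hit for the late block and a disk load with `crosses_epoch_boundary: true` for the re-orging block, which corresponds to costs (a) to (c) above.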

Example

Here's an instance of this behaviour that I observed at slot n=2485472 on mainnet, resulting in block processing taking 2.5s instead of the usual 80ms (median) or 456ms (99th percentile) (metrics from sigp/lighthouse-metrics#31).

Nov 11 16:55:03.815 WARN Beacon chain re-org                     reorg_distance: 1, new_slot: 2485473, new_head: 0xb17f…a572, new_head_parent: 0x0f98…9b22, previous_slot: 2485472, previous_head: 0x1c4d…c94b, service: beacon
Nov 11 16:55:03.818 DEBG Delayed head block                      set_as_head_delay: Some(222.219889ms), imported_delay: Some(2.545278411s), observed_delay: Some(2.051036927s), block_delay: 4.818535227s, slot: 2485473, proposer_index: 52065, block_root: 0xb17fe52ce55315713a9e3eb28858a1a53039daf9e1f6406aa2c8d0d8ae11a572, service: beacon

Even though the block arrived on time, taking 2.5s to process it meant that any attestations at this slot would have been missed (if running on this node).
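
As a worked check of the numbers in the log line above, here is a small sketch. The `HeadDelays` struct and the helper functions are hypothetical names. Two things are assumptions here: that `block_delay` is the sum of the three component delays (the logged values are consistent with this), and that the mainnet attestation deadline is one third of the 12s slot.

```rust
use std::time::Duration;

/// Mainnet constants (assumed here, not taken from the log).
const SECONDS_PER_SLOT: u64 = 12;
const SLOTS_PER_EPOCH: u64 = 32;

/// Attestations are due one third of the way into the slot (4s on mainnet).
fn attestation_deadline() -> Duration {
    Duration::from_secs(SECONDS_PER_SLOT / 3)
}

/// True if `slot` is the first slot of an epoch.
fn is_epoch_boundary(slot: u64) -> bool {
    slot % SLOTS_PER_EPOCH == 0
}

/// Hypothetical container for the component delays in a "Delayed head block" log.
struct HeadDelays {
    observed: Duration,
    imported: Duration,
    set_as_head: Duration,
}

impl HeadDelays {
    /// Time from slot start until the block became head, assuming the
    /// components simply add up.
    fn total(&self) -> Duration {
        self.observed + self.imported + self.set_as_head
    }

    /// True if the block became head only after the attestation deadline.
    fn misses_attestation_deadline(&self) -> bool {
        self.total() > attestation_deadline()
    }
}

/// The delays logged for the block at slot 2485473.
fn logged_delays() -> HeadDelays {
    HeadDelays {
        observed: Duration::new(2, 51_036_927),
        imported: Duration::new(2, 545_278_411),
        set_as_head: Duration::new(0, 222_219_889),
    }
}
```

Under these assumptions the components add up to the logged `block_delay` of 4.818535227s, which is past the 4s deadline. Slot 2485472 also satisfies `is_epoch_boundary`, matching the `n % 32 == 0` condition in the description.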

Additional Info

This behaviour should be quite rare, due to the infrequency of re-orgs and late blocks on mainnet (at worst ~4% of blocks are late, with very few being 12s+ late). However, if proposer boosting is adopted, we may see more re-orgs of this type, where a proposer intentionally orphans the previous block despite it having been published.
