polygon/heimdall: fix snapshot store RangeFromBlockNum #13689

Draft: wants to merge 13 commits into base: main

taratorio
Member

@taratorio taratorio commented Feb 4, 2025

fixes #13674

The issue was that after rm -rf chaindata:

  1. txEntityStore.EntityIdFromBlockNum returns an entity id (in this case a Checkpoint entity id) whose corresponding entity is no longer in the DB (in this case erigon/datadir/heimdall/mdbx.dat) because it has been retired into its corresponding snapshot file.
  2. In turn, txEntityStore.RangeFromBlockNum returns only the Checkpoint entities it still has in the DB, all of which have a Start block num greater than that of the entity in 1). This creates a gap of X checkpoints.
  3. In turn, when we download the corresponding blocks for the checkpoints in 2), we hit a "pos sync failed: block gap inserted" error, caused by the gap between the block at which downloading starts and the start of the first checkpoint in 2).
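The three steps above can be illustrated with a minimal sketch. It is not the Erigon implementation: the Checkpoint struct, the store type, and the rangeFromDbOnly, rangeFromAll, tail and demoStore names are all hypothetical, and the block numbers are made up. It only shows how a range lookup that consults the DB alone starts past the requested block once older entities have been retired into snapshots.

```go
package main

import (
	"fmt"
	"sort"
)

// Checkpoint is a simplified, hypothetical stand-in for a heimdall checkpoint.
type Checkpoint struct {
	Id         uint64
	StartBlock uint64
	EndBlock   uint64
}

// store is a hypothetical two-tier store: older entities have been retired
// into snapshots, newer ones are still in the DB. Both slices are sorted by Id.
type store struct {
	snapshots []Checkpoint
	db        []Checkpoint
}

// tail returns all checkpoints from the first one whose EndBlock >= blockNum.
func tail(cps []Checkpoint, blockNum uint64) []Checkpoint {
	i := sort.Search(len(cps), func(i int) bool { return cps[i].EndBlock >= blockNum })
	return cps[i:]
}

// rangeFromDbOnly mimics the behaviour described in step 2: only the DB is consulted.
func (s store) rangeFromDbOnly(blockNum uint64) []Checkpoint {
	return tail(s.db, blockNum)
}

// rangeFromAll consults snapshots followed by the DB, so no entities are skipped.
func (s store) rangeFromAll(blockNum uint64) []Checkpoint {
	all := make([]Checkpoint, 0, len(s.snapshots)+len(s.db))
	all = append(all, s.snapshots...)
	all = append(all, s.db...)
	return tail(all, blockNum)
}

// demoStore builds made-up data: ids 1-3 live in snapshots, ids 4-5 in the DB.
func demoStore() store {
	return store{
		snapshots: []Checkpoint{{1, 0, 99}, {2, 100, 199}, {3, 200, 299}},
		db:        []Checkpoint{{4, 300, 399}, {5, 400, 499}},
	}
}

func main() {
	s := demoStore()
	fmt.Println("db only:", s.rangeFromDbOnly(150))
	fmt.Println("snapshots + db:", s.rangeFromAll(150))
}
```

With this data, a lookup from block 150 against the DB alone begins at the checkpoint starting at block 300, which leaves blocks 150 to 299 uncovered; the combined lookup begins at the checkpoint that contains block 150.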

Managed to reproduce the above by:

  • deleting the last 10k worth of snapshots for headers.seg, bodies.seg, transactions.seg and the corresponding .idx files
  • running rm -rf chaindata
  • adding an err check that fails with "pos sync failed: unexpected first checkpoint with id 15274 has start 17673843 which is greater than download start 17660000", and hitting it after the above 2 steps
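The err check from the last reproduction step could look roughly like the sketch below. The Checkpoint struct, the checkFirstCheckpoint function and the sentinel error are hypothetical names chosen for illustration; only the message text is taken from the error quoted above, and the "pos sync failed:" prefix is assumed to be added by the caller.

```go
package main

import (
	"errors"
	"fmt"
)

// Checkpoint is a simplified, hypothetical stand-in for a heimdall checkpoint.
type Checkpoint struct {
	Id         uint64
	StartBlock uint64
}

// errUnexpectedFirstCheckpoint is a hypothetical sentinel error.
var errUnexpectedFirstCheckpoint = errors.New("unexpected first checkpoint")

// checkFirstCheckpoint fails when the first checkpoint returned by the store
// starts after the block at which block downloading is about to begin,
// because the blocks in between would never be downloaded.
func checkFirstCheckpoint(first Checkpoint, downloadStart uint64) error {
	if first.StartBlock > downloadStart {
		return fmt.Errorf(
			"%w with id %d has start %d which is greater than download start %d",
			errUnexpectedFirstCheckpoint, first.Id, first.StartBlock, downloadStart,
		)
	}
	return nil
}

func main() {
	// Values taken from the error message quoted in the reproduction steps.
	err := checkFirstCheckpoint(Checkpoint{Id: 15274, StartBlock: 17673843}, 17660000)
	fmt.Println(err)
}
```

Failing early on this condition surfaces the missing checkpoints before any blocks are downloaded, rather than later as a block gap error.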

@taratorio
Member Author

Wait until #13702 is merged and the issue with bor checkpoints snapshot generation is resolved (this needs a bug fix and regeneration of the snapshots).

mh0lt pushed a commit that referenced this pull request Feb 5, 2025
#13702)

relates to #13689 and #13674

seems like we have an issue with our checkpoints snapshot generation as gaps are discovered
@taratorio
Member Author

taratorio commented Feb 5, 2025

need to re-test once we fix snapshot generation logic for borcheckpoints and bormilestones to avoid gaps

@taratorio
Member Author

taratorio commented Feb 5, 2025

E.g. the problem is a gap like this (FirstInDb and LastInSnapshots should connect, i.e. there should be no missing entity ids between them):

entityId_FromBlockNum: 15265
entityId_FirstInDb: 15274
entityId_LastInSnapshotsInReverse:15261 kind=*heimdall.Checkpoint
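As a rough sketch of what "connecting" means for the values above, assume entity ids are consecutive integers and that snapshots and the DB connect when the first id in the DB is at most one past the last id in snapshots. The missingBetween and inGap helpers are hypothetical and not part of Erigon.

```go
package main

import "fmt"

// missingBetween returns how many entity ids lie strictly between the last id
// in snapshots and the first id in the DB. It returns 0 when the two connect
// or overlap.
func missingBetween(lastInSnapshots, firstInDb uint64) uint64 {
	if firstInDb <= lastInSnapshots+1 {
		return 0
	}
	return firstInDb - lastInSnapshots - 1
}

// inGap reports whether id falls strictly between the last id in snapshots
// and the first id in the DB, i.e. it is stored in neither place.
func inGap(id, lastInSnapshots, firstInDb uint64) bool {
	return id > lastInSnapshots && id < firstInDb
}

func main() {
	// Values taken from the log lines above.
	const fromBlockNum, firstInDb, lastInSnapshots = 15265, 15274, 15261
	fmt.Println("missing ids:", missingBetween(lastInSnapshots, firstInDb))
	fmt.Println("requested id is in the gap:", inGap(fromBlockNum, lastInSnapshots, firstInDb))
}
```

For the logged values, ids 15262 through 15273 are missing from both places, and the id resolved from the block number (15265) is one of them.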

Successfully merging this pull request may close these issues.

Astrid: better support rm -rf chaindata case