Backward sync stuck in a loop #6749

Closed
@pinges

Description

The backward sync (BWS) is getting stuck in a loop with these recurring log messages:

{"@timestamp":"2024-03-05T13:57:59,199","level":"INFO","thread":"EthScheduler-Timer-0","class":"BackwardSyncContext","message":"Current backward sync session failed, it will be restarted","throwable":""}
{"@timestamp":"2024-03-05T13:58:01,114","level":"INFO","thread":"vert.x-worker-thread-0","class":"BackwardSyncContext","message":"Starting a new backward sync session","throwable":""}

Enough peers are present.

Restarting the node fixes the problem.

Reason

We receive an fcu (forkchoiceUpdated) containing the block hash of a head block. This hash is added to the hashesToAppend queue. The block then gets reorged out, and when the BWS tries to retrieve it from our peers, none of them is able to provide it. This causes the BWS to fail. When we receive the next fcu, a new hash might be added to the queue, but the new BWS that gets started tries to retrieve the same block that we already failed to retrieve, so it fails again.
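
To make the loop easier to reason about, here is a minimal, self-contained Java sketch of the behaviour described above. It is not Besu's actual code: the class `BackwardSyncLoopSketch`, its methods, the FIFO ordering of the queue, and the `dropUnretrievable` flag are all hypothetical and exist only for illustration. The flag shows one possible mitigation (discarding a hash that no peer can serve), which is an assumption and not necessarily the fix that was adopted.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Set;

/** Hypothetical model of the failure mode; NOT Besu's BackwardSyncContext. */
public class BackwardSyncLoopSketch {
    // Stand-in for the hashesToAppend queue mentioned in the issue (FIFO is an assumption).
    private final Deque<String> hashesToAppend = new ArrayDeque<>();
    // Hashes that peers can still serve; a reorged-out block is absent from this set.
    private final Set<String> retrievableFromPeers;
    // Hypothetical mitigation switch: drop hashes that no peer can provide.
    private final boolean dropUnretrievable;

    public BackwardSyncLoopSketch(Set<String> retrievableFromPeers, boolean dropUnretrievable) {
        this.retrievableFromPeers = retrievableFromPeers;
        this.dropUnretrievable = dropUnretrievable;
    }

    /** Models one fcu: enqueue the head hash, then start a new session. */
    public boolean onForkchoiceUpdated(String headHash) {
        hashesToAppend.addLast(headHash);
        return runSession();
    }

    /** Returns true if the session drained the queue, false if it failed. */
    private boolean runSession() {
        while (!hashesToAppend.isEmpty()) {
            String next = hashesToAppend.peekFirst();
            if (!retrievableFromPeers.contains(next)) {
                if (dropUnretrievable) {
                    hashesToAppend.removeFirst(); // discard the stale hash and carry on
                    continue;
                }
                return false; // session fails; the stale hash stays at the head of the queue
            }
            hashesToAppend.removeFirst(); // block retrieved and appended
        }
        return true;
    }

    public int pendingHashes() {
        return hashesToAppend.size();
    }
}
```

With `dropUnretrievable` set to `false`, every later fcu starts a session that fails on the same stale hash, which matches the recurring "failed, it will be restarted" / "Starting a new backward sync session" log pair. With the flag set to `true`, the stale hash is discarded and later hashes are processed.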

This happened on 7 out of 8 nodes I started based on 24.2.0-RC4:
dev-elc-besu-teku-mainnet-dev-stefan-rc4-(1,2,4)
dev-elc-besu-teku-mainnet-dev-stefan-ss-(1,2,3,4)

The block reorged had the hash 0x4550b82492bf1738af79efb6140770c5443d368b9512ae8551583909554a040f.

The Kibana link below should work for about another 3 weeks:
Kibana: https://kibana.dev.protocols.consensys.net/app/r?l=DISCOVER_APP_LOCATOR&v=8.11.0&lz=N4IgjgrgpgTgniAXKSsGJANwLYH0B2AhtlIgDogAmUmAtFADYDGtARlAM4S0AuUA1t2yEAlvnxQetanQ58AZoXy0KAAiWVVJDh0IBzUhQAMADwAsAVgtHWADgBMZgJz3W8gIwB2AMy3C8zycoeVYANnczI09PIyYLMzNvSm9Q21YnC3d7QihbK3cLW28nIwz4wiNI%2BTUQABoQBiU9CH0oJBBBNBAAX3qOAHsYHiQAbRGQAAEeEW0eYgAHOqpOJhAAXTX6pn6GCGx8DlGsPCISJe1dA3X6sWoTdvsCs3t7eSdaIqLaS1DKWicnN4WEF5IlvEZKJ43GYlmI%2BDBMIQGO1CBAeP0lvIRAx4YdECNNlRCHMAGoiKAAdwAkpQHk8Xm8Pr5vN8LL9%2FoDgcEwRCoaCltMSAAlJptZAgeQwfrYdr4foU2jgygAelp9XRsvlPXqMGCuo4AAsqfh4YjkeKzdAkKEjLajPV5qiOGKeDBoN1ukA%3D%3D

Node rc4-1 has been restarted and finished syncing successfully.

Labels

P2 High (ex: Degrading performance issues, unexpected behavior of core features (DevP2P, syncing, etc))
bug (Something isn't working)