Skip to content

Conversation

@heifner
Copy link
Contributor

@heifner heifner commented Jul 14, 2025

Reverts #1676. A recent test failure #1716 demonstrates the need for the reset. The test failed because it timed out waiting for LIB to advance. The node did actually advance LIB but not in the expected 30 second timeout. The issue was that an unlinkable block at transition from LIB catchup to HEAD catchup didn't reset sync and therefore the node didn't know it needed to continue syncing. The handshake was not sufficient because the other node thought it had caught up. This was exasperated by the node syncing was in IRREVERSIBLE mode. It is possible that we could live with only resetting when running in IRREVERSIBLE mode. However, I think the safest and most direct approach is to go back to resetting the sync, but guard against resetting the sync when know we have already reset the sync. Therefore, fixing #1673 in a different way.

Resolves #1716
Resolves #1673

@heifner heifner requested review from greg7mdp and spoonincode July 14, 2025 17:59
@heifner heifner added the OCI Work exclusive to OCI team label Jul 14, 2025
@heifner heifner linked an issue Jul 14, 2025 that may be closed by this pull request
{
// reset sync on rejected block
fc::lock_guard g( sync_mtx );
if (sync_last_requested_num != 0 && blk_num <= sync_next_expected_num-1) { // no need to reset if we already reset and are syncing again
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is not clear to me why this check indicates that we already reset?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Say you are syncing and blocks 50-100.

  • Block 65 is rejected.
    • Reset sync_last_requested_num = 0
    • sync_next_expected_num = 60 // something less <= 65 since set to LIB+1
  • Block 66 is rejected because 65 just was and 66 can't link to anything
    • 66 <= 59 is not true, so don't reset
    • Same for 67-100.

@heifner heifner merged commit ac2b0c5 into release/1.2 Jul 15, 2025
36 checks passed
@heifner heifner deleted the GH-1716-p2p-irreversible-sync-1.2 branch July 15, 2025 15:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

OCI Work exclusive to OCI team

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Test failure: nodeos_read_terminate_at_block_if_lr_test

4 participants