Commit 4791706
algonautshant authored and onetechnical committed
catchup: suspend the catchup session once the agreement service kicks in (#3299)
The catchup service stops when it is complete, i.e. when it has caught up to the round currently being agreed on. It detects completion when it finds that a block is already in the ledger before it adds that block itself. Apart from the catchup, only the agreement service adds blocks to the ledger, so if the agreement adds a block before the catchup does, the agreement is ahead and the catchup is complete. When `fetchAndWrite` detects that the block is already in the ledger, it returns, and a return value of `false` stops the catchup syncing.

In previous releases, `fetchAndWrite` checked whether the block was already in the ledger only after attempting to fetch it. A block that has not yet been agreed on cannot be fetched, so the fetch failed after multiple attempts and `fetchAndWrite` returned `false`, ending the catchup.

A recent change made this process more efficient by checking whether the block is in the ledger before and during the fetch. However, once the block was found in the ledger, `fetchAndWrite` returned `true` instead of `false` (consistent with the long-standing logic, which was also wrong). This caused the catchup to continue syncing after it was complete. This change fixes the return value from `true` to `false`.
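The return-value contract described above can be illustrated with a minimal, self-contained sketch. This is not the code in `catchup/service.go`: `toyLedger`, `fetchBlock`, and `catchUp` are hypothetical stand-ins, and only the names `fetchAndWrite` and `errLedgerAlreadyHasBlock` are borrowed from the commit. It assumes a sequential sync loop rather than the real pipelined one.

```go
package main

import (
	"errors"
	"fmt"
)

// errLedgerAlreadyHasBlock borrows the name of the sentinel error in the diff.
var errLedgerAlreadyHasBlock = errors.New("ledger already has block")

// toyLedger is a hypothetical stand-in for the ledger: the set of rounds it holds.
type toyLedger map[uint64]bool

// fetchBlock is a hypothetical fetcher that checks the ledger before fetching.
func fetchBlock(l toyLedger, r uint64) error {
	if l[r] {
		return errLedgerAlreadyHasBlock
	}
	return nil // pretend the download succeeded
}

// fetchAndWrite returns true to keep syncing and false to stop.
func fetchAndWrite(l toyLedger, r uint64) bool {
	if err := fetchBlock(l, r); errors.Is(err, errLedgerAlreadyHasBlock) {
		// Only the agreement could have added this block, so catchup is complete.
		return false
	}
	l[r] = true // write the fetched block
	return true
}

// catchUp syncs rounds first..last and returns how many blocks it wrote.
func catchUp(l toyLedger, first, last uint64) int {
	written := 0
	for r := first; r <= last; r++ {
		if !fetchAndWrite(l, r) {
			break // a false result ends the catchup session
		}
		written++
	}
	return written
}

func main() {
	l := toyLedger{4: true}       // the agreement already added round 4
	fmt.Println(catchUp(l, 1, 6)) // prints 3: rounds 1 to 3 are written, then sync stops
}
```

If `fetchAndWrite` returned `true` in the already-in-ledger branch, as it did before this commit, the loop in this sketch would carry on to rounds 5 and 6 even though the agreement is already ahead.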
1 parent 4860375 commit 4791706

1 file changed: +9 -5 lines changed

catchup/service.go

Lines changed: 9 additions & 5 deletions
@@ -241,8 +241,10 @@ func (s *Service) fetchAndWrite(r basics.Round, prevFetchCompleteChan chan bool,

 	if err != nil {
 		if err == errLedgerAlreadyHasBlock {
-			// ledger already has the block, no need to request this block from anyone.
-			return true
+			// ledger already has the block, no need to request this block.
+			// only the agreement could have added this block into the ledger, catchup is complete
+			s.log.Infof("fetchAndWrite(%d): the block is already in the ledger. The catchup is complete", r)
+			return false
 		}
 		s.log.Debugf("fetchAndWrite(%v): Could not fetch: %v (attempt %d)", r, err, i)
 		peerSelector.rankPeer(psp, peerRankDownloadFailed)
@@ -353,8 +355,10 @@ func (s *Service) fetchAndWrite(r basics.Round, prevFetchCompleteChan chan bool,
 		s.log.Infof("fetchAndWrite(%d): no need to re-evaluate historical block", r)
 		return true
 	case ledgercore.BlockInLedgerError:
-		s.log.Infof("fetchAndWrite(%d): block already in ledger", r)
-		return true
+		// the block was added to the ledger from elsewhere after fetching it here
+		// only the agreement could have added this block into the ledger, catchup is complete
+		s.log.Infof("fetchAndWrite(%d): after fetching the block, it is already in the ledger. The catchup is complete", r)
+		return false
 	case protocol.Error:
 		if !s.protocolErrorLogged {
 			logging.Base().Errorf("fetchAndWrite(%v): unrecoverable protocol error detected: %v", r, err)
@@ -387,7 +391,7 @@ func (s *Service) pipelineCallback(r basics.Round, thisFetchComplete chan bool,
 	thisFetchComplete <- fetchResult

 	if !fetchResult {
-		s.log.Infof("failed to fetch block %v", r)
+		s.log.Infof("pipelineCallback(%d): did not fetch or write the block", r)
 		return 0
 	}
 	return r
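
The last hunk shows how a `false` result travels onward: it is sent on `thisFetchComplete`, and the callback reports round 0 instead of `r`. The sketch below isolates that step under simplifying assumptions. `pipelineResult` is a hypothetical, flattened stand-in for the real `pipelineCallback`, whose full signature is truncated in the diff, and rounds are modeled as plain `uint64` values rather than `basics.Round`.

```go
package main

import "fmt"

// pipelineResult is a hypothetical, simplified version of the callback body in the diff.
// It forwards the fetch result on the channel and returns the round on success, or 0
// when the block was neither fetched nor written.
func pipelineResult(r uint64, thisFetchComplete chan bool, fetchResult bool) uint64 {
	thisFetchComplete <- fetchResult
	if !fetchResult {
		return 0
	}
	return r
}

func main() {
	// Buffered so the send does not block in this single-goroutine sketch.
	done := make(chan bool, 1)

	next := pipelineResult(12, done, true)
	fmt.Println(next, <-done) // prints "12 true"

	next = pipelineResult(13, done, false)
	fmt.Println(next, <-done) // prints "0 false"
}
```

The reworded log message fits this contract: after the fix, a `false` result no longer means only a failed download; it also covers the case where the agreement service wrote the block first.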
