-
Notifications
You must be signed in to change notification settings - Fork 82
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: nodes keep out of sync when missing gossiped block (issue #284) #540
fix: nodes keep out of sync when missing gossiped block (issue #284) #540
Conversation
Fixing the following issue #284.
|
@srene tests fail |
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## main #540 +/- ##
==========================================
- Coverage 52.46% 52.43% -0.04%
==========================================
Files 100 100
Lines 14717 14764 +47
==========================================
+ Hits 7722 7742 +20
- Misses 5886 5911 +25
- Partials 1109 1111 +2 ☔ View full report in Codecov by Sentry. |
@omri build successful with go1.19 after gossipsub downgrade |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO applyBlock should only applyBlock and it's being called from the sequencer or full nodes but adding caching logic there is not relevant for some of the callers (like the sequncer).
You can simply check in the ApplyBlockCallback
before even calling the applyBlock if it should be cached (by the store height etc) . If it’s correct apply it. If not add to cache. And than every time I get a block, assuming my cache is not empty I can try and call AttemptApplyCachedBlocks
(or some better name)
This function should probably be called in 2 situations:
- once we get a new gossiped block
- once we get a block from the DA
block/retriever.go
Outdated
@@ -50,19 +50,22 @@ func (m *Manager) syncUntilTarget(ctx context.Context, syncTarget uint64) error | |||
return err | |||
} | |||
|
|||
if settlementBatch.StartHeight != currentHeight+1 { | |||
//code commented out because it may be the case the start height from the da batch is not exactly the next block from the last received |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can simply delete the entire lines. no need for leaving it with comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
block/retriever.go
Outdated
if currentHeight != settlementBatch.EndHeight { | ||
|
||
//code commented out because it may be the case the current height is higher than da height after applying pending blocks, in case of missing blocks | ||
/*if currentHeight != settlementBatch.EndHeight { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can simply delete the entire lines. no need for leaving it with comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
block/block.go
Outdated
@@ -112,6 +113,30 @@ func (m *Manager) applyBlock(ctx context.Context, block *types.Block, commit *ty | |||
return nil | |||
} | |||
|
|||
func (m *Manager) checkPrevCachedBlocks(ctx context.Context) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably best to create a mutex on the this function as can be called from both DA code and gossiped blocks callback which may lead to a Race condition.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
true. mutex added
block/block.go
Outdated
@@ -112,6 +113,30 @@ func (m *Manager) applyBlock(ctx context.Context, block *types.Block, commit *ty | |||
return nil | |||
} | |||
|
|||
func (m *Manager) checkPrevCachedBlocks(ctx context.Context) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as this function is not only checking it's also applying I think the name is confusing. Also when it says "Prev" blocks it can be confusing as it has been previously cached but those blocks can be in the future (i.e not yet applied)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
name changed to attemptApplyCachedBlocks
block/manager.go
Outdated
m.logger.Debug("Failed to apply block", "err", err) | ||
|
||
if block.Header.Height != m.store.Height()+1 { | ||
// We crashed after the commit and before updating the store height. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment is actually not accurate as we can cache blocks which just came out of order regardless to any crash.
block/manager.go
Outdated
|
||
if block.Header.Height != m.store.Height()+1 { | ||
// We crashed after the commit and before updating the store height. | ||
m.prevBlock[block.Header.Height] = &block |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should probably try to apply the cached blocks even if we just applied a block because and not just if the blocks are out of order. for example, if we got previously block at (store height + 2) and put it into cache and now we got storeHeight + 1 (which means we won't go into the "if") we can still apply the cached block and no need to wait for the next out-of-order block in order to try it..
I would simply add at the end of the function a check, if the block we have in cache is == m.storeHeight + 1 then run ApplyCachedBlock
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
true. having function call in the if condition may need an extra block to apply all cached blocks. it's been changed
block/manager.go
Outdated
m.logger.Debug("Caching block", "block height", block.Header.Height, "store height", m.store.Height()) | ||
err := m.checkPrevCachedBlocks(context.Background()) | ||
if err != nil { | ||
m.logger.Debug("Error applying previous cached blocks", "err", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Im not sure if Error
is correct here as it’s not necessarily due to error - it’s mostly based on attempting. So maybe “failed” . especially when you log it with a "debug"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
log message changed
block/manager.go
Outdated
// We crashed after the commit and before updating the store height. | ||
m.prevBlock[block.Header.Height] = &block | ||
m.prevCommit[block.Header.Height] = &commit | ||
m.logger.Debug("Caching block", "block height", block.Header.Height, "store height", m.store.Height()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO makes more send to move the “caching block” message before the actual caching and not after.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
block/manager.go
Outdated
m.prevBlock[block.Header.Height] = &block | ||
m.prevCommit[block.Header.Height] = &commit | ||
m.logger.Debug("Caching block", "block height", block.Header.Height, "store height", m.store.Height()) | ||
err := m.checkPrevCachedBlocks(context.Background()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why context.background() is being passed? seems like it’s not being used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is actually used when calling applyBlock. there is no context in the manager so context.background() is used by now
PR Standards
Opening a pull request should be able to meet the following requirements
For Author:
godoc
commentsFor Reviewer:
After reviewer approval: