-
Notifications
You must be signed in to change notification settings - Fork 670
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor: extract state sync logic from Chain #12833
Conversation
829de4c
to
1064482
Compare
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #12833 +/- ##
==========================================
+ Coverage 70.37% 70.38% +0.01%
==========================================
Files 848 850 +2
Lines 174870 174908 +38
Branches 174870 174908 +38
==========================================
+ Hits 123064 123113 +49
+ Misses 46547 46540 -7
+ Partials 5259 5255 -4
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
.chain_store_update | ||
.get_block_header_on_chain_by_height(&sync_hash, chunk.height_included())?; | ||
let block_header = get_block_header_on_chain_by_height( | ||
&self.chain_store_update.chain_store(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this part a no-op? So before we were calling the ChainStoreAccess
trait method on the ChainStoreUpdate<'a>
, and it looks like that one checks the in memory stuff before going to the disk with . chain_store
, but after this change we will skip any of the in-memory changes. Maybe it's fine if I'm not really sure if this function won't ever be called with relevant uncommitted changes, not sure. But if you meant for this part to be a no-op, maybe you could just have get_block_header_on_chain_by_height()
take a T: ChainStoreAccess
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah... I have short comment below about that, but it seems better to copy-paste it here as well :)
So, this method uses only block headers, and 1) they are already synced 2) save_block_header is never called here, so this is a no-op.
As for ChainStoreAccess
- this is the part of some larger refactoring:
The existing design is to abstract away ChainStoreAccess
to handle both ChainStore
and ChainStoreUpdate
.
While idea makes sense, in reality we don't seem to get much from it. The hypothesis is that originally ChainStoreUpdate
was made to commit many blocks to access temporary updates. But in reality, we almost always commit one block at a time, and we can fix it if we don't. So all the complexity ChainStoreUpdate
has can likely be removed. I learned yesterday that @shreyan-gupta is testing this idea in #12162.
That's why I lean towards reducing usages to ChainStoreAccess
:) If you want, I can make this a separate PR, because that's indeed a serious change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can also say that the "set_state_finalize" moves forward from current height to sync_hash
while get_block_header_on_chain_by_height
moves backwards, starting from sync_hash
, so there is no chance to reuse saved data.
self.chain_store_update.get_block_header_on_chain_by_height(&sync_hash, height); | ||
// Note that block headers are already synced and can be taken | ||
// from store on disk. | ||
let block_header_result = get_block_header_on_chain_by_height( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as above for this one
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! Thanks for the change!
Main result: number of LoC in
chain/chain/src/chain.rs
goes from 4690 to 4125.Please let me know if you want me to split this in multiple PRs. I can do this but I'm very tempted to just go forward because there are no functional changes...
Biggest part (568+ lines)
It is moving state sync-related methods from
chain/chain/src/chain.rs
tochain/chain/src/state_sync/adapter.rs
. This is achieved by creatingChainStateSyncAdapter
which needs access to much less data thanChain
.Another change (156+ lines)
To achieve the previous one, I wanted to read chain data from
ChainStoreAdapter
, notChainStore
. We are currently refactoringChainStore
and I didn't want to create new instances of it. We can allow read-only access to data which already exist in storage.For this, in turn, I needed to make
get_chunk_clone_from_header
,get_block_header_on_chain_by_height
andget_incoming_receipts_for_shard
require onlyChainStoreAdapter
access. I was able to move these tochain/chain/src/store/utils.rs
without changing logic.Another bit of context: these methods previously required
ChainStoreAccess
in order to be queried byChainStoreUpdate
. But as @shreyan-gupta discovered, we don't actually need that, at least in almost all places of the code. So it was natural to me to simplify access to these functions already.Bonus
It was very easy to move
get_state_sync_info
toShardTracker
andStateRequestTracker
to new state sync folder - a nice example of hiding implementation details.