Stateful -> stateless migration preparation #65
Comments
Need to think about how and when the in-memory trie (memtrie) will be enabled.
Here are the options for dealing with the memtrie launch during stateful -> stateless migration:
@bowenwang1996 I heard you're in favor of Option 1. Could you confirm that it is still the best option after considering the other options available? Regardless of the option picked, we would need to repeatedly test this protocol upgrade.
@robin-near yes, I still think option 1 is the best. 2 and 3 are too complex and add not just engineering complexity but also testing burden. 4 and 5 would likely degrade performance, and we cannot control whether there will be high load on mainnet when it happens, so it is best to avoid performance degradation altogether.
Based on today's discussion, unloading memtries will not be necessary since the validators will need to restart the nodes to downgrade the RAM size.
Unloading memtrie is done: near/nearcore#11657.
It seems we have "shadow tracking" already implemented: https://github.com/near/nearcore/blob/master/core/chain-configs/src/client_config.rs#L412
I think this may track the shard where the account id is located, not the shard that the validator with this account id would track. Either way this is a good find and definitely related, perhaps we can reuse some of it for our purpose? Also the name is way less catchy than shadow tracking ;)
2024-07-01 (Monday) Update
Part of: near/near-one-project-tracking#65. An option for a non-validator node to track the shards of a given validator. During the stateful -> stateless protocol upgrade a node will track all shards and will require a lot of RAM. After the migration we can move the validator key to a new, smaller node that does not track all shards. To do this with minimal downtime, the new node needs to have the appropriate shards in place and memtries loaded in memory; then we hot swap the validator key without stopping the new node. But before that happens the new node is not a validator, and we need a way to tell it which validator's shards it should track.
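The difference raised in the thread, tracking the shard where an account lives versus tracking the shards a validator is assigned, can be sketched roughly as below. This is only an illustration under stated assumptions: `Tracking`, `EpochView`, and `shards_to_track` are hypothetical names, the `String`/`u64` aliases stand in for nearcore's own `AccountId` and `ShardId` types, and none of this is the actual nearcore config or API.

```rust
use std::collections::{BTreeMap, BTreeSet};

// Hypothetical stand-ins; nearcore defines its own AccountId and ShardId types.
type AccountId = String;
type ShardId = u64;

/// Hypothetical tracking modes for a non-validator node.
enum Tracking {
    /// Track the shard in which the given account's state is located.
    AccountShard(AccountId),
    /// Track whatever shards the given validator is assigned in this epoch
    /// ("shadow tracking").
    ShadowValidator(AccountId),
}

/// Hypothetical per-epoch view: where each account lives, and which shards
/// each validator is assigned to track.
struct EpochView {
    account_to_shard: BTreeMap<AccountId, ShardId>,
    validator_assignments: BTreeMap<AccountId, BTreeSet<ShardId>>,
}

/// Returns the set of shards the node should track under the given mode.
/// Unknown accounts or validators yield an empty set.
fn shards_to_track(mode: &Tracking, epoch: &EpochView) -> BTreeSet<ShardId> {
    match mode {
        Tracking::AccountShard(account) => epoch
            .account_to_shard
            .get(account)
            .copied()
            .into_iter()
            .collect(),
        Tracking::ShadowValidator(validator) => epoch
            .validator_assignments
            .get(validator)
            .cloned()
            .unwrap_or_default(),
    }
}
```

With shadow tracking, the smaller node can sync state and load memtries for exactly the shards the validator will need, so the validator key can be hot swapped onto it later without a restart.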
Identify the right way to enable memtrie during the stateful-to-stateless validation protocol upgrade.
Before the upgrade all nodes will track all shards, and after the upgrade they will start tracking only one or some shards. Memtrie is expected to be enabled when tracking a single shard or a subset of shards, as it requires more memory. However, we may also need to think about how to make the transition from (stateful + disk tries) to (stateless + memtries). During this transition, we may need to run nodes with memtries while tracking all shards. Assuming this is the path to follow, we need two follow-ups:
Profile the memory usage needed to do that (a similar task was opened for RPC nodes tracking all shards: "Profile memory usage requirements of RPC and archival nodes in stateless validation", nearcore#11230).
Make sure after the protocol upgrade the memtries for the shards that are not tracked in the new epoch are unloaded, leaving the nodes only with memtries for their tracked shards.
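The unloading step could look roughly like the sketch below. It assumes a hypothetical registry, `LoadedMemTries`, keyed by shard id; `MemTrie`, `approx_bytes`, and `retain_tracked` are illustrative names and not the nearcore implementation (the actual change is in near/nearcore#11657).

```rust
use std::collections::{BTreeMap, BTreeSet};

// Hypothetical stand-in; nearcore defines its own ShardId type.
type ShardId = u64;

/// Hypothetical placeholder for a loaded in-memory trie.
struct MemTrie {
    approx_bytes: u64,
}

/// Hypothetical registry of the memtries currently loaded on a node.
struct LoadedMemTries {
    by_shard: BTreeMap<ShardId, MemTrie>,
}

impl LoadedMemTries {
    /// Drops every memtrie whose shard is not in `tracked` and returns the
    /// ids of the shards that were unloaded, in ascending order.
    fn retain_tracked(&mut self, tracked: &BTreeSet<ShardId>) -> Vec<ShardId> {
        let to_unload: Vec<ShardId> = self
            .by_shard
            .keys()
            .copied()
            .filter(|shard| !tracked.contains(shard))
            .collect();
        for shard in &to_unload {
            self.by_shard.remove(shard);
        }
        to_unload
    }

    /// Sum of the approximate sizes of the memtries still loaded.
    fn loaded_bytes(&self) -> u64 {
        self.by_shard.values().map(|trie| trie.approx_bytes).sum()
    }
}
```

Calling `retain_tracked` with the shard set for the new epoch at the epoch boundary would leave the node holding memtries only for its tracked shards.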
Related thread here.