Stateful -> stateless migration preparation #65

Closed · Tracked by #46
walnut-the-cat opened this issue Apr 10, 2024 · 9 comments

walnut-the-cat commented Apr 10, 2024

Identify the right way to enable memtrie during the stateful-to-stateless validation protocol upgrade.
Before the upgrade, all nodes track all shards; after the upgrade, they will start tracking only one or a few shards. Memtrie is expected to be enabled only when tracking a single shard or a few shards, since it requires significantly more memory. However, we also need to think about how to make the transition from (stateful + disk tries) to (stateless + memtries): during the transition, we may need to run nodes with memtries while tracking all shards. Assuming this is the path we follow, we need two follow-ups:

  1. Profile the memory usage needed to do that (a similar task was opened for RPC nodes tracking all shards:
    Profile memory usage requirements of RPC and archival nodes in stateless validation nearcore#11230)

  2. Make sure that after the protocol upgrade, memtries for shards not tracked in the new epoch are unloaded, leaving each node with memtries only for its tracked shards (see the sketch below).

Related thread here.
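As a rough illustration of follow-up 2, here is a minimal sketch with hypothetical types and names; nearcore's actual memtrie container and API differ:

```rust
use std::collections::{HashMap, HashSet};

/// Stand-in for nearcore's in-memory trie; illustrative only.
struct MemTrie;

/// At the first epoch boundary after the upgrade, keep only the memtries
/// for shards tracked in the new epoch; dropping the rest frees their memory.
fn unload_untracked_memtries(
    loaded: &mut HashMap<u64, MemTrie>, // shard id -> loaded memtrie
    tracked_shards: &HashSet<u64>,      // shards tracked in the new epoch
) {
    loaded.retain(|shard_id, _| tracked_shards.contains(shard_id));
}
```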

walnut-the-cat (Author) commented

Need to think about how and when the in-memory trie will be enabled.

robin-near commented May 7, 2024

Here are the options for dealing with the memtrie launch during the stateful -> stateless migration:

  1. (All, then One) Enable memtries first, confirming that memtries are loaded for all shards; then, after the protocol upgrade, one memtrie remains in memory (for the assigned shard) while the others are unloaded. The downside of this approach is that all stateless chunk producers temporarily need high-memory instances before the upgrade. (A policy sketch follows this list.)
  2. (Assigned Shard Only) Enable memtries first, but modify the logic so that even in the stateful case, a node loads only the shard it is assigned to (or, if it is not a validator, all shards). This makes it consistent with the stateless case. However, it may be difficult to implement. Suppose in the stateful case we have epochs E and E + 1, where in E we are assigned shard 1 and in E + 1 we are assigned shard 2. While in epoch E, we need to find some time to load shard 2 into memory in preparation for the next epoch, but since we are still tracking shard 2, its trie keeps changing. So we have two options:
    • (Concurrent Load) Modify the memtrie loading code so that it supports loading a "hot shard", i.e. a shard undergoing concurrent changes; this may or may not be easy, but it at least involves freezing the flat storage, as we do when taking snapshots.
    • (Forced State Sync) Even though we track shard 2, we force it to go through state sync: stop tracking the shard at the beginning of epoch E, pretend to do a state sync (a no-op, since we already have the data), and then enter catchup. During catchup we automatically load the memtrie first, then catch up until the shard is up to date, at which point we track it again and everything continues normally. The problem is that while we wait for the memtrie to load, this validator is not validating the shard, which may become a security issue (though with 6 shards, is that really a problem?).
  3. (None, then One) Enable memtries first, but modify the logic to only load memtries under the stateless protocol. The challenge here is similar to Option 2, but it only exists for the very last epoch before the protocol upgrade, because in preparation for the first stateless epoch we need to load the memtrie for one shard.
  4. (None, then None, then One) Enable memtries first, but modify the logic to only load memtries starting from the second epoch after the protocol upgrade. This makes the implementation easy, but the first epoch after the protocol upgrade will not have memtries and so will have degraded performance.
  5. (None, then Manual) Do not enable memtries before the protocol upgrade; ask node operators to enable memtries after the upgrade has passed. This is like Option 4, except that chunk producers will be degraded until they take action themselves.
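For concreteness, a minimal sketch of the Option 1 policy, assuming a hypothetical stateless protocol version parameter and illustrative names:

```rust
/// Option 1 ("All, then One"): before the stateless protocol version, load
/// memtries for every shard; from the upgrade onward, only for the shards
/// the node is assigned. All identifiers here are assumptions.
fn shards_to_load_memtries(
    protocol_version: u32,
    stateless_version: u32, // hypothetical version that enables stateless validation
    all_shards: &[u64],
    assigned_shards: &[u64],
) -> Vec<u64> {
    if protocol_version < stateless_version {
        // Stateful era: every node tracks all shards, so load all memtries.
        // This is the temporary high-memory period before the upgrade.
        all_shards.to_vec()
    } else {
        // Stateless era: keep only the assigned shard(s); the rest are
        // unloaded at the epoch boundary.
        assigned_shards.to_vec()
    }
}
```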

@bowenwang1996 I heard you're in favor of Option 1; could you confirm that it is still the best option, after considering the other options available?

Regardless of the option picked, we would need to repeatedly test this protocol upgrade.

bowenwang1996 commented

@robin-near yes, I still think Option 1 is the best. Options 2 and 3 are too complex and add not just engineering complexity but also testing burden. Options 4 and 5 would likely degrade performance, and we cannot control whether there will be high load on mainnet when the upgrade happens, so it is best to avoid performance degradation altogether.

tayfunelmas commented

Based on today's discussion, unloading memtries will not be necessary, since validators will need to restart their nodes anyway to downgrade the RAM size.

staffik commented Jun 25, 2024

Unloading memtrie is done: near/nearcore#11657.

staffik commented Jun 27, 2024

It seems we have "shadow tracking" already implemented: https://github.com/near/nearcore/blob/master/core/chain-configs/src/client_config.rs#L412
I will test how it works with memtries and validator key hot swap.
cc @tayfunelmas @wacban
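For reference, a rough sketch of what that part of the config looks like; the field names are from memory and should be treated as assumptions rather than a verbatim copy of client_config.rs:

```rust
// Illustrative sketch of the tracking knobs in
// core/chain-configs/src/client_config.rs; not nearcore's exact struct.
type ShardId = u64;
type AccountId = String;

pub struct ClientConfigSketch {
    /// Shards tracked unconditionally, by id.
    pub tracked_shards: Vec<ShardId>,
    /// Track whichever shard currently contains each of these accounts.
    /// As the next comment notes, this follows the account's shard, not
    /// the shards assigned to a validator with that account id.
    pub tracked_accounts: Vec<AccountId>,
}
```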

wacban commented Jun 27, 2024

I think this may track the shard where the account id is located, not the shards that the validator with this account id would track. Either way, this is a good find and definitely related; perhaps we can reuse some of it for our purpose?

Also the name is way less catchy than shadow tracking ;)

staffik commented Jul 1, 2024

2024-07-01 (Monday) Update

  • We did the migration twice, including RPC and split-storage archival nodes, and everything looks ok.
  • Validator key hot swap + shadow tracking works fine, near-zero missed chunks.
  • Described results in a doc.
  • What's left: we will see how it works with reduced network bandwidth.

github-merge-queue bot pushed a commit to near/nearcore that referenced this issue Jul 5, 2024
Part of: near/near-one-project-tracking#65
An option for a non-validator node to track the shards of a given validator.

During the stateful -> stateless protocol upgrade, a node will track all
shards and will require a lot of RAM. After the migration, we can move
the validator key to a new, smaller node that does not track all shards.
To do this with minimal downtime, the new node needs to have the
appropriate shards in place and memtries loaded in memory; then we hot
swap the validator key without stopping the new node.
But before that happens, the new node is not a validator, and we need a
way to tell it which validator's shards it should track.
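A hedged sketch of the behavior this option adds; all identifiers below are illustrative, not the actual nearcore ones:

```rust
type ShardId = u64;
type AccountId = String;

/// If the node is configured to shadow a validator, track whatever shards
/// that validator is assigned in the current epoch, so state and memtries
/// are already warm when the validator key is hot-swapped in.
fn shards_to_track<F>(
    shadow_validator: Option<&AccountId>,
    validator_assignment: F, // the epoch's validator -> assigned-shards mapping
    default_tracked: Vec<ShardId>,
) -> Vec<ShardId>
where
    F: Fn(&AccountId) -> Vec<ShardId>,
{
    match shadow_validator {
        Some(v) => validator_assignment(v), // shadow the validator's shards
        None => default_tracked,            // normal tracking config
    }
}
```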
VanBarbascu pushed a commit to near/nearcore that referenced this issue Jul 6, 2024
staffik closed this as completed Aug 2, 2024