Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf(vm): Improve snapshot management in batch executor #2513

Merged

Conversation

slowli
Copy link
Contributor

@slowli slowli commented Jul 26, 2024

What ❔

  • Does not embed snapshots into each other, and instead use a linear workflow.
  • Discards an old snapshot, if any, when processing a new transaction (currently, old snapshots are seemingly never discarded unless the transaction is rolled back).

Why ❔

  • Linear snapshow workflow is easier to grasp and it could allow optimizations in the future.
  • Accumulating snapshots introduces an overhead, esp. in the new VM.

Checklist

  • PR title corresponds to the body of PR (we generate changelog entries from PRs).
  • Tests for the changes have been added / updated.
  • Documentation comments have been added / updated.
  • Code has been formatted via zk fmt and zk lint.

@slowli slowli marked this pull request as ready for review July 29, 2024 14:12
/// External callers must follow the following snapshot workflow:
///
/// - Each new snapshot created using `make_snapshot()` must be either popped or rolled back before creating the following snapshot.
/// OTOH, it's not required to call either of these methods by the end of VM execution.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this mean? Seems to contradict the line above.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the contradiction? We don't create a snapshot at the end of the VM execution, so it may have either 0 or 1 snapshot at the end.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's not required to call either of these methods by the end of VM execution.

This confused me (and still does) but I can understand now that it means that you can have multiple TX after your snapshot and then rollback/forget.

core/lib/multivm/src/interface/traits/vm.rs Outdated Show resolved Hide resolved
/// - `pop_snapshot_no_rollback()` may be called spuriously, when no snapshot was created. It is a no-op in this case.
///
/// These rules guarantee that at each given moment, a VM instance has <=1 snapshot (unless the VM makes snapshots internally),
/// which may allow additional VM optimizations.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that a low amount of snapshots just avoids allocating a Vec for them. Zero snapshots allow forgetting history, which is important for memory use.

core/lib/multivm/src/versions/shadow.rs Outdated Show resolved Hide resolved
core/lib/state/src/rocksdb/mod.rs Outdated Show resolved Hide resolved
core/node/state_keeper/src/batch_executor/traits.rs Outdated Show resolved Hide resolved
@slowli slowli force-pushed the aov-pla-1000-improve-snapshot-management-in-batch-executor branch from bcec129 to b51e26d Compare July 30, 2024 18:05
@slowli
Copy link
Contributor Author

slowli commented Jul 30, 2024

I've decided to remove batch VM traits from the PR. They can be added separately, after the new VM is merged into main.

@slowli
Copy link
Contributor Author

slowli commented Jul 31, 2024

There are 2 somewhat suspicious failing CI jobs on this PR, but I don't see how they could be related to this PR:

  • First: Missing revert data for estimateGas during the upgrade test
  • Second: Missing revert data for estimateGas during server integration tests (specifically "ERC20 contract checks › Can perform a deposit with precalculated max value") + reuse of a nonce in another server integration test ("ERC20 contract checks › Should claim failed deposit").

Since the new VM isn't integrated into the API server, I don't see how the estimateGas error could be related to this PR or the new VM integration in general, but I may be missing something.

@slowli slowli requested review from joonazan and popzxc July 31, 2024 08:44
Copy link
Contributor

@joonazan joonazan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm in favor of merging if you are sure that the disabled tests can be fixed.

self.snapshots
.pop()
.expect("Snapshot should be created before rolling it back");
self.snapshots.pop();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When is it necessary to pop a nonexistent snapshot?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It happens here on the first transaction, or after the previous transaction was rolled back.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you could just always pop the snapshot if it isn't rolled back in the same place where you may roll it back.

@slowli slowli merged commit 69be78c into jms-vm2 Jul 31, 2024
47 checks passed
@slowli slowli deleted the aov-pla-1000-improve-snapshot-management-in-batch-executor branch July 31, 2024 15:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants