[compiler-v2] Extend existing v1/v2 comparison process description #12726

Merged
merged 2 commits into from
Oct 8, 2024

@wrwg wrwg commented Mar 28, 2024

Description

`tests/README.md` has for a while described how porting v1 tests into the v2 tree works. Because of continued misalignment about the process and its motivation, this PR adds more detail to the README, outlining the test comparison process and why it cannot be done with a simple text diff.

Type of Change

  • New feature
  • Bug fix
  • Breaking change
  • Performance improvement
  • Refactoring
  • Dependency update
  • Documentation update
  • Tests

Which Components or Systems Does This Change Impact?

  • Validator Node
  • Full Node (API, Indexer, etc.)
  • Move/Aptos Virtual Machine
  • Aptos Framework
  • Aptos CLI/SDK
  • Developer Infrastructure
  • Other (specify)

How Has This Been Tested?

NA

Key Areas to Review

README.md

trunk-io bot commented Mar 28, 2024

⏱️ 1h 20m total CI duration on this PR
| Job | Cumulative Duration | Recent Runs |
|-----|---------------------|-------------|
| windows-build | 27m | 🟩 |
| rust-move-unit-coverage | 21m | 🟩 |
| rust-unit-tests | 15m | 🟥 |
| rust-lints | 7m | 🟩 |
| check | 4m | 🟩 |
| general-lints | 2m | 🟩 |
| check-dynamic-deps | 2m | 🟩 |
| semgrep/ci | 36s | 🟩 |
| file_change_determinator | 15s | 🟩 |
| file_change_determinator | 15s | 🟩 |
| file_change_determinator | 13s | 🟩 |
| run-tests-main-branch | 11s | 🟩 |
| permission-check | 8s | 🟩 |
| permission-check | 5s | 🟩 |
| rust-move-tests | 4s | 🟩 |
| permission-check | 4s | 🟩 |
| permission-check | 2s | 🟩 |

🚨 3 jobs on the last run were significantly faster/slower than expected

| Job | Duration | vs 7d avg | Delta |
|-----|----------|-----------|-------|
| windows-build | 27m | 20m | +33% |
| run-tests-main-branch | 11s | 4m | -96% |
| rust-move-tests | 4s | 14m | -100% |

codecov bot commented Mar 28, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 59.8%. Comparing base (4bed99d) to head (4784700).
Report is 4 commits behind head on main.

❗ There is a different number of reports uploaded between BASE (4bed99d) and HEAD (4784700).

HEAD has 1 upload fewer than BASE.

| Flag | BASE (4bed99d) | HEAD (4784700) |
|------|----------------|----------------|
|      | 2              | 1              |

Additional details and impacted files
@@             Coverage Diff             @@
##             main   #12726       +/-   ##
===========================================
- Coverage    71.4%    59.8%    -11.7%     
===========================================
  Files        2400      853     -1547     
  Lines      485208   208243   -276965     
===========================================
- Hits       346557   124530   -222027     
+ Misses     138651    83713    -54938     

Comment on lines +17 to +18
- Any number of bytecode level checkers or transformers (currently `live-var` and `reference-safety`
and `visibility-checker`)

Suggested change
- Any number of bytecode level checkers or transformers (currently `live-var` and `reference-safety`
and `visibility-checker`)
- Several stackless bytecode level checkers or transformers (e.g., `live-var`, `reference-safety`)

In order to migrate a test such that the tool can keep track of it, ensure that you place it in a similarly named parent directory (anywhere in the v2 test tree). For example, for a test `move-check/x/y.move`, ensure the test can be found somewhere at `x/y.move` in the v2 tree.
Because of this it is expensive to do test comparison, and essential that we follow the migration
process as outlined above. Specifically, do _not_ bulk copy tests into the v2 tree without
manually auditing them, and do _not_ fork tests, even if they are modified, so the relation

Just so I am on the same page, what does "forking a test" mean?

I think it means leaving the original test unchanged and adding a new, modified copy. I had argued that we should also run the original test, to guarantee that V2 reports at least some error in the cases where V1 reports one, although it may be a different error due to error shadowing: passes run in different orders, so V2 may show a different error first and then exit before the pass that would display the errors V1 presents. Here Wolfgang is disagreeing with that approach.
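
The error shadowing described in this comment can be sketched with a toy model. Everything here is an assumption for illustration: the pass names, their ordering in each compiler, and the error messages are hypothetical and do not reflect the real v1 or v2 pipelines.

```python
def run_pipeline(passes, program_errors):
    """Run passes in order and stop at the first one that reports errors.

    `program_errors` maps a pass name to the errors that pass would
    report for one fixed input program (hypothetical data).
    """
    for name in passes:
        errors = program_errors.get(name, [])
        if errors:
            # Later passes never run, so their errors are shadowed.
            return name, errors
    return None, []


# One input program that is faulty in two independent ways.
faults = {
    "type-check": ["E1: type mismatch"],
    "reference-safety": ["E2: dangling borrow"],
}

# Hypothetical pass orders: the same passes, run in a different sequence.
v1_order = ["type-check", "reference-safety"]
v2_order = ["reference-safety", "type-check"]

v1_result = run_pipeline(v1_order, faults)
v2_result = run_pipeline(v2_order, faults)
print(v1_result)
print(v2_result)
```

Both pipelines reject the program, but each surfaces a different error first, so a plain text diff of the two expected outputs would flag a difference even though neither compiler is wrong.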

Because of this it is expensive to do test comparison, and essential that we follow the migration
process as outlined above. Specifically, do _not_ bulk copy tests into the v2 tree without
manually auditing them, and do _not_ fork tests, even if they are modified, so the relation
between v1/v2 tests is maintained.

You forgot to mention:

- The review process for every PR must check all test output to ensure that the user-visible
  outputs (e.g., error messages and transactional test behavior) do not change.
- As the final goal is to generate errors for programs, test inputs should not be changed again without
  clear documentation of why this was done. In particular, if semantics change, they must be
  documented.

Perhaps a file could be added to note changes in semantics?

The Quirks doc may suffice for the short-term, but we need to communicate subtle semantics better.

@brmataptos brmataptos changed the title from "[compiler-v2] Extends existing v1/v2 commparsion process description" to "[compiler-v2] Extend existing v1/v2 comparison process description" Mar 28, 2024

This issue is stale because it has been open 45 days with no activity. Remove the stale label, comment or push a commit - otherwise this will be closed in 15 days.

@github-actions github-actions bot added the Stale label May 18, 2024
@github-actions github-actions bot closed this Jun 2, 2024
@brmataptos brmataptos reopened this Jun 2, 2024
@github-actions github-actions bot removed the Stale label Jun 3, 2024
@brmataptos brmataptos added the stale-exempt label (prevents issues from being automatically marked and closed as stale) Jun 10, 2024
@wrwg wrwg enabled auto-merge (squash) October 8, 2024 04:00

github-actions bot commented Oct 8, 2024

✅ Forge suite compat success on 46bf19eb4f132b9d8fc19eff3f3334cdf9aa1775 ==> 47847004f15790b092318f8084c38cbcb0678ce4

Compatibility test results for 46bf19eb4f132b9d8fc19eff3f3334cdf9aa1775 ==> 47847004f15790b092318f8084c38cbcb0678ce4 (PR)
1. Check liveness of validators at old version: 46bf19eb4f132b9d8fc19eff3f3334cdf9aa1775
compatibility::simple-validator-upgrade::liveness-check : committed: 12080.22 txn/s, latency: 2408.58 ms, (p50: 2100 ms, p70: 2200, p90: 2600 ms, p99: 14200 ms), latency samples: 471460
2. Upgrading first Validator to new version: 47847004f15790b092318f8084c38cbcb0678ce4
compatibility::simple-validator-upgrade::single-validator-upgrading : committed: 7658.88 txn/s, latency: 3635.14 ms, (p50: 4100 ms, p70: 4400, p90: 4700 ms, p99: 5000 ms), latency samples: 136320
compatibility::simple-validator-upgrade::single-validator-upgrade : committed: 7404.00 txn/s, latency: 4332.79 ms, (p50: 4500 ms, p70: 4600, p90: 6000 ms, p99: 6600 ms), latency samples: 245120
3. Upgrading rest of first batch to new version: 47847004f15790b092318f8084c38cbcb0678ce4
compatibility::simple-validator-upgrade::half-validator-upgrading : committed: 7284.73 txn/s, latency: 3824.25 ms, (p50: 4300 ms, p70: 4500, p90: 4600 ms, p99: 4700 ms), latency samples: 136840
compatibility::simple-validator-upgrade::half-validator-upgrade : committed: 7539.24 txn/s, latency: 4296.28 ms, (p50: 4500 ms, p70: 4600, p90: 4700 ms, p99: 4900 ms), latency samples: 252140
4. upgrading second batch to new version: 47847004f15790b092318f8084c38cbcb0678ce4
compatibility::simple-validator-upgrade::rest-validator-upgrading : committed: 10000.21 txn/s, latency: 2789.21 ms, (p50: 2500 ms, p70: 3000, p90: 4700 ms, p99: 5800 ms), latency samples: 177580
compatibility::simple-validator-upgrade::rest-validator-upgrade : committed: 11308.17 txn/s, latency: 2827.00 ms, (p50: 2700 ms, p70: 2800, p90: 3500 ms, p99: 5300 ms), latency samples: 369720
5. check swarm health
Compatibility test for 46bf19eb4f132b9d8fc19eff3f3334cdf9aa1775 ==> 47847004f15790b092318f8084c38cbcb0678ce4 passed
Test Ok

github-actions bot commented Oct 8, 2024

✅ Forge suite realistic_env_max_load success on 47847004f15790b092318f8084c38cbcb0678ce4

two traffics test: inner traffic : committed: 12711.82 txn/s, latency: 3130.06 ms, (p50: 3000 ms, p70: 3300, p90: 3600 ms, p99: 4700 ms), latency samples: 4833360
two traffics test : committed: 100.01 txn/s, latency: 2763.22 ms, (p50: 2600 ms, p70: 3000, p90: 3700 ms, p99: 5800 ms), latency samples: 1820
Latency breakdown for phase 0: ["QsBatchToPos: max: 0.256, avg: 0.230", "QsPosToProposal: max: 0.334, avg: 0.270", "ConsensusProposalToOrdered: max: 0.328, avg: 0.311", "ConsensusOrderedToCommit: max: 0.614, avg: 0.544", "ConsensusProposalToCommit: max: 0.929, avg: 0.855"]
Max non-epoch-change gap was: 0 rounds at version 0 (avg 0.00) [limit 4], 1.04s no progress at version 2624462 (avg 0.22s) [limit 15].
Max epoch-change gap was: 0 rounds at version 0 (avg 0.00) [limit 4], 7.91s no progress at version 2624460 (avg 7.91s) [limit 15].
Test Ok

@wrwg wrwg merged commit 67f7ee6 into main Oct 8, 2024
96 checks passed
@wrwg wrwg deleted the wrwg/doc-v1v2 branch October 8, 2024 04:30
4 participants