[compiler-v2] Extend existing v1/v2 comparison process description #12726

wrwg · 2024-03-28T15:56:58Z

Description

`tests/README.md` has for a while described how porting v1 tests into the v2 tree works. Because of continued misalignment about the process and its motivation, this PR adds more detail to the README, outlining the test comparison process and why it cannot be done with a simple text diff.

Type of Change

Which Components or Systems Does This Change Impact?

How Has This Been Tested?

NA

Key Areas to Review

README.md


trunk-io · 2024-03-28T15:57:02Z

⏱️ 1h 20m total CI duration on this PR

Job	Cumulative Duration	Recent Runs
windows-build	27m	🟩
rust-move-unit-coverage	21m	🟩
rust-unit-tests	15m	🟥
rust-lints	7m	🟩
check	4m	🟩
general-lints	2m	🟩
check-dynamic-deps	2m	🟩
semgrep/ci	36s	🟩
file_change_determinator	15s	🟩
file_change_determinator	15s	🟩
file_change_determinator	13s	🟩
run-tests-main-branch	11s	🟩
permission-check	8s	🟩
permission-check	5s	🟩
rust-move-tests	4s	🟩
permission-check	4s	🟩
permission-check	2s	🟩

🚨 3 jobs on the last run were significantly faster/slower than expected

Job	Duration	vs 7d avg
windows-build	27m	20m
run-tests-main-branch	11s	4m
rust-move-tests	4s	14m


codecov · 2024-03-28T16:17:32Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 59.8%. Comparing base (4bed99d) to head (4784700).
Report is 4 commits behind head on main.

❗ There is a different number of reports uploaded between BASE (4bed99d) and HEAD (4784700).

HEAD has 1 upload less than BASE

Flag	BASE (4bed99d)	HEAD (4784700)
	2	1

Additional details and impacted files

@@             Coverage Diff             @@
##             main   #12726       +/-   ##
===========================================
- Coverage    71.4%    59.8%    -11.7%     
===========================================
  Files        2400      853     -1547     
  Lines      485208   208243   -276965     
===========================================
- Hits       346557   124530   -222027     
+ Misses     138651    83713    -54938


vineethk · 2024-03-28T16:14:25Z

third_party/move/move-compiler-v2/tests/README.md

+- Any number of bytecode level checkers or transformers (currently `live-var` and `reference-safety`
+  and `visibility-checker`)


Suggested change

-- Any number of bytecode level checkers or transformers (currently `live-var` and `reference-safety`
-  and `visibility-checker`)
+- Several stackless bytecode level checkers or transformers (eg., `live-var`, `reference-safety`)

vineethk · 2024-03-28T16:17:49Z

third_party/move/move-compiler-v2/tests/README.md

-In order to migrate a test such that the tool can keep track of it, ensure that you place it in a similar named parent directory (anywhere in the v2 test tree). For example, for a test `move-check/x/y.move`, ensure the test can be found somewhere at `x/y.move` in the v2 tree.
+Because of this it is expensive to do test comparison, and essential that we follow the migration
+process as outlined above. Specifically, do _not_ bulk copy tests into the v2 tree without
+manual auditing them, and do _not_ fork tests, even if they are modified, so the relation


Just so I am on the same page, what does "forking a test" mean?

I think it means: leaving the original test unchanged + adding a new copy which is modified. I had argued that we should also run the original test, to guarantee that we at least note some error in the cases where V1 has an error, although it may be a different error due to error shadowing (passes run in different orders, so V2 may show a different error first, then exit before the pass that would display the errors which V1 presents). Here Wolfgang is disagreeing with that approach.
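The error shadowing described in this thread can be illustrated with a minimal sketch. The pass names, their ordering, and the error messages below are hypothetical and are not taken from either compiler. The sketch only shows why two pipelines that stop at the first failing pass can report different errors for the same input, so that a plain text diff of expected outputs is not a meaningful comparison.

```python
# Hypothetical sketch of error shadowing. Pass names, pass orders, and
# error messages are invented for illustration only.

def run_pipeline(pass_order, errors_by_pass):
    """Run passes in order and stop at the first pass that reports errors."""
    for pass_name in pass_order:
        errors = errors_by_pass.get(pass_name, [])
        if errors:
            # Later passes never run, so their errors stay hidden.
            return errors
    return []

# One test program that has a problem in each of two passes.
errors_by_pass = {
    "type_check": ["type mismatch"],
    "borrow_check": ["invalid borrow"],
}

# Assumed orders: the two compilers run the same checks in different order.
v1_order = ["type_check", "borrow_check"]
v2_order = ["borrow_check", "type_check"]

v1_output = run_pipeline(v1_order, errors_by_pass)
v2_output = run_pipeline(v2_order, errors_by_pass)

print("v1 reports:", v1_output)
print("v2 reports:", v2_output)
```

Both pipelines reject the program, but the reported text differs, so a reviewer has to judge by hand whether the two outputs are equivalent.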

brmataptos · 2024-03-28T16:28:12Z

third_party/move/move-compiler-v2/tests/README.md

+Because of this it is expensive to do test comparison, and essential that we follow the migration
+process as outlined above. Specifically, do _not_ bulk copy tests into the v2 tree without
+manual auditing them, and do _not_ fork tests, even if they are modified, so the relation
+between v1/v2 tests is maintained.


You forgot to mention:

- The review process for every PR must check all test output to ensure that the user-visible outputs (e.g., error messages and transactional test behavior) do not change.
- As the final goal is to generate errors for programs, test inputs should not be changed again without clear documentation of why this was done. In particular, if semantics change, they must be documented.

Perhaps a file could be added to note changes in semantics?

The Quirks doc may suffice for the short-term, but we need to communicate subtle semantics better.

github-actions · 2024-05-18T01:48:28Z

This issue is stale because it has been open 45 days with no activity. Remove the stale label, comment or push a commit - otherwise this will be closed in 15 days.

github-actions · 2024-10-08T04:29:05Z

✅ Forge suite `compat` success on `46bf19eb4f132b9d8fc19eff3f3334cdf9aa1775` ==> `47847004f15790b092318f8084c38cbcb0678ce4`

Compatibility test results for 46bf19eb4f132b9d8fc19eff3f3334cdf9aa1775 ==> 47847004f15790b092318f8084c38cbcb0678ce4 (PR)
1. Check liveness of validators at old version: 46bf19eb4f132b9d8fc19eff3f3334cdf9aa1775
compatibility::simple-validator-upgrade::liveness-check : committed: 12080.22 txn/s, latency: 2408.58 ms, (p50: 2100 ms, p70: 2200, p90: 2600 ms, p99: 14200 ms), latency samples: 471460
2. Upgrading first Validator to new version: 47847004f15790b092318f8084c38cbcb0678ce4
compatibility::simple-validator-upgrade::single-validator-upgrading : committed: 7658.88 txn/s, latency: 3635.14 ms, (p50: 4100 ms, p70: 4400, p90: 4700 ms, p99: 5000 ms), latency samples: 136320
compatibility::simple-validator-upgrade::single-validator-upgrade : committed: 7404.00 txn/s, latency: 4332.79 ms, (p50: 4500 ms, p70: 4600, p90: 6000 ms, p99: 6600 ms), latency samples: 245120
3. Upgrading rest of first batch to new version: 47847004f15790b092318f8084c38cbcb0678ce4
compatibility::simple-validator-upgrade::half-validator-upgrading : committed: 7284.73 txn/s, latency: 3824.25 ms, (p50: 4300 ms, p70: 4500, p90: 4600 ms, p99: 4700 ms), latency samples: 136840
compatibility::simple-validator-upgrade::half-validator-upgrade : committed: 7539.24 txn/s, latency: 4296.28 ms, (p50: 4500 ms, p70: 4600, p90: 4700 ms, p99: 4900 ms), latency samples: 252140
4. upgrading second batch to new version: 47847004f15790b092318f8084c38cbcb0678ce4
compatibility::simple-validator-upgrade::rest-validator-upgrading : committed: 10000.21 txn/s, latency: 2789.21 ms, (p50: 2500 ms, p70: 3000, p90: 4700 ms, p99: 5800 ms), latency samples: 177580
compatibility::simple-validator-upgrade::rest-validator-upgrade : committed: 11308.17 txn/s, latency: 2827.00 ms, (p50: 2700 ms, p70: 2800, p90: 3500 ms, p99: 5300 ms), latency samples: 369720
5. check swarm health
Compatibility test for 46bf19eb4f132b9d8fc19eff3f3334cdf9aa1775 ==> 47847004f15790b092318f8084c38cbcb0678ce4 passed
Test Ok

github-actions · 2024-10-08T04:30:01Z

✅ Forge suite `realistic_env_max_load` success on `47847004f15790b092318f8084c38cbcb0678ce4`

two traffics test: inner traffic : committed: 12711.82 txn/s, latency: 3130.06 ms, (p50: 3000 ms, p70: 3300, p90: 3600 ms, p99: 4700 ms), latency samples: 4833360
two traffics test : committed: 100.01 txn/s, latency: 2763.22 ms, (p50: 2600 ms, p70: 3000, p90: 3700 ms, p99: 5800 ms), latency samples: 1820
Latency breakdown for phase 0: ["QsBatchToPos: max: 0.256, avg: 0.230", "QsPosToProposal: max: 0.334, avg: 0.270", "ConsensusProposalToOrdered: max: 0.328, avg: 0.311", "ConsensusOrderedToCommit: max: 0.614, avg: 0.544", "ConsensusProposalToCommit: max: 0.929, avg: 0.855"]
Max non-epoch-change gap was: 0 rounds at version 0 (avg 0.00) [limit 4], 1.04s no progress at version 2624462 (avg 0.22s) [limit 15].
Max epoch-change gap was: 0 rounds at version 0 (avg 0.00) [limit 4], 7.91s no progress at version 2624460 (avg 7.91s) [limit 15].
Test Ok

wrwg requested review from vineethk, sausagee, rahxephon89, fEst1ck and brmataptos March 28, 2024 15:56

vineethk approved these changes Mar 28, 2024


brmataptos reviewed Mar 28, 2024


brmataptos changed the title ~~[compiler-v2] Extends existing v1/v2 commparsion process description~~ [compiler-v2] Extend existing v1/v2 comparison process description Mar 28, 2024

sausagee approved these changes Apr 2, 2024


github-actions bot added the Stale label May 18, 2024

github-actions bot closed this Jun 2, 2024

brmataptos reopened this Jun 2, 2024

github-actions bot removed the Stale label Jun 3, 2024

brmataptos added the stale-exempt label (prevents issues from being automatically marked and closed as stale) Jun 10, 2024

Merge branch 'main' into wrwg/doc-v1v2

4784700

wrwg enabled auto-merge (squash) October 8, 2024 04:00


wrwg merged commit 67f7ee6 into main Oct 8, 2024
96 checks passed

wrwg deleted the wrwg/doc-v1v2 branch October 8, 2024 04:30
