[release builder] fix framework upgrade ordering (main) #14814

Merged
merged 1 commit into from
Oct 1, 2024

Conversation

vgao1996
Contributor

This fixes a bug in the release builder that results in the framework upgrade scripts being generated in reverse order.

trunk-io bot commented Sep 30, 2024

⏱️ 31m total CI duration on this PR

| Slowest 15 Jobs | Cumulative Duration | Recent Runs |
|---|---|---|
| rust-doc-tests | 5m | 🟩 |
| execution-performance / test-target-determinator | 5m | 🟩 |
| test-target-determinator | 5m | 🟩 |
| rust-cargo-deny | 4m | 🟩🟩 |
| check | 4m | 🟩 |
| check-dynamic-deps | 2m | 🟩🟩 |
| fetch-last-released-docker-image-tag | 2m | 🟩 |
| rust-move-tests | 2m | 🟩 |
| rust-move-tests | 2m | 🟩 |
| general-lints | 52s | 🟩🟩 |
| semgrep/ci | 51s | 🟩🟩 |
| file_change_determinator | 20s | 🟩🟩 |
| execution-performance / single-node-performance | 11s | 🟩 |
| file_change_determinator | 10s | 🟩 |
| permission-check | 9s | 🟩🟩 |

🚨 1 job on the last run was significantly faster/slower than expected

| Job | Duration | vs 7d avg | Delta |
|---|---|---|---|
| execution-performance / single-node-performance | 11s | 18m | -99% |

@vgao1996 vgao1996 changed the title [release builder] fix framework update ordering [release builder] fix framework upgrade ordering Sep 30, 2024
@vgao1996 vgao1996 changed the title [release builder] fix framework upgrade ordering [release builder] fix framework upgrade ordering (main) Sep 30, 2024
Contributor

@georgemitenkov georgemitenkov left a comment

Can we have some unit tests for this?

@vgao1996
Contributor Author

vgao1996 commented Oct 1, 2024

@georgemitenkov agreed. Lack of test coverage is exactly what led to this and some other bugs in the release builder in the first place. We'll need to put more thought into the exact form of the tests, though: in this particular case, for example, generating and then simulating the proposals would not have caught the bug.

I've added test coverage as a high-priority item on our release builder roadmap. It will ship either before or alongside the other improvements we've planned (e.g. simulation before submission).
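
For illustration only, a minimal sketch of what an ordering-focused unit test might look like. This is not code from this PR or from the release builder: `generate_script_names`, `preserves_order`, the package names, and the `<index>-<package>.move` naming scheme are all hypothetical stand-ins. The idea is simply that a test asserting on the order of the generated scripts would flag a reversal directly.

```rust
/// Hypothetical stand-in for the script generation step: given packages
/// listed in the order they must be upgraded, return numbered script names
/// in that same order. The naming scheme is an assumption for this sketch.
fn generate_script_names(packages_in_upgrade_order: &[&str]) -> Vec<String> {
    packages_in_upgrade_order
        .iter()
        .enumerate()
        .map(|(idx, name)| format!("{}-{}.move", idx, name))
        .collect()
}

/// Returns true only if script `i` is exactly the script expected for
/// package `i`, i.e. the generated scripts follow the package order.
fn preserves_order(packages: &[&str], scripts: &[String]) -> bool {
    packages.len() == scripts.len()
        && packages
            .iter()
            .enumerate()
            .zip(scripts)
            .all(|((idx, pkg), script)| *script == format!("{}-{}.move", idx, pkg))
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn scripts_follow_package_order() {
        // Illustrative package list; the real list and order may differ.
        let packages = ["move-stdlib", "aptos-stdlib", "aptos-framework"];
        let scripts = generate_script_names(&packages);
        assert!(preserves_order(&packages, &scripts));

        // A reversed output, like the bug described in this PR, must fail the check.
        let mut reversed = scripts.clone();
        reversed.reverse();
        assert!(!preserves_order(&packages, &reversed));
    }
}
```

A check of this shape looks only at the generated artifacts, so it does not depend on simulating the proposals.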

@vgao1996 vgao1996 enabled auto-merge (squash) October 1, 2024 15:55

Contributor

github-actions bot commented Oct 1, 2024

✅ Forge suite realistic_env_max_load success on 4da135961c57d3025bc26e3472819902e9b3e9c0

two traffics test: inner traffic : committed: 14425.76 txn/s, latency: 2750.96 ms, (p50: 2700 ms, p70: 2700, p90: 3000 ms, p99: 3200 ms), latency samples: 5485060
two traffics test : committed: 100.09 txn/s, latency: 1553.41 ms, (p50: 1500 ms, p70: 1500, p90: 1600 ms, p99: 8000 ms), latency samples: 1700
Latency breakdown for phase 0: ["QsBatchToPos: max: 0.249, avg: 0.225", "QsPosToProposal: max: 1.125, avg: 1.099", "ConsensusProposalToOrdered: max: 0.321, avg: 0.292", "ConsensusOrderedToCommit: max: 0.420, avg: 0.407", "ConsensusProposalToCommit: max: 0.709, avg: 0.698"]
Max non-epoch-change gap was: 0 rounds at version 0 (avg 0.00) [limit 4], 0.94s no progress at version 2887012 (avg 0.20s) [limit 15].
Max epoch-change gap was: 0 rounds at version 0 (avg 0.00) [limit 4], 8.69s no progress at version 2887010 (avg 8.69s) [limit 15].
Test Ok

Contributor

github-actions bot commented Oct 1, 2024

✅ Forge suite framework_upgrade success on 7ef01a26f8d8a38610e3d364b722df517c970749 ==> 4da135961c57d3025bc26e3472819902e9b3e9c0

Compatibility test results for 7ef01a26f8d8a38610e3d364b722df517c970749 ==> 4da135961c57d3025bc26e3472819902e9b3e9c0 (PR)
Upgrade the nodes to version: 4da135961c57d3025bc26e3472819902e9b3e9c0
framework_upgrade::framework-upgrade::full-framework-upgrade : committed: 1155.46 txn/s, submitted: 1158.80 txn/s, failed submission: 3.34 txn/s, expired: 3.34 txn/s, latency: 2618.64 ms, (p50: 2700 ms, p70: 3000, p90: 3600 ms, p99: 5500 ms), latency samples: 103820
framework_upgrade::framework-upgrade::full-framework-upgrade : committed: 1193.83 txn/s, submitted: 1196.74 txn/s, failed submission: 2.91 txn/s, expired: 2.91 txn/s, latency: 2545.99 ms, (p50: 2400 ms, p70: 2700, p90: 3600 ms, p99: 5100 ms), latency samples: 106680
5. check swarm health
Compatibility test for 7ef01a26f8d8a38610e3d364b722df517c970749 ==> 4da135961c57d3025bc26e3472819902e9b3e9c0 passed
Upgrade the remaining nodes to version: 4da135961c57d3025bc26e3472819902e9b3e9c0
framework_upgrade::framework-upgrade::full-framework-upgrade : committed: 1129.45 txn/s, submitted: 1132.10 txn/s, failed submission: 2.65 txn/s, expired: 2.65 txn/s, latency: 2634.02 ms, (p50: 2400 ms, p70: 2800, p90: 3900 ms, p99: 5900 ms), latency samples: 102220
Test Ok

Contributor

github-actions bot commented Oct 1, 2024

✅ Forge suite compat success on 7ef01a26f8d8a38610e3d364b722df517c970749 ==> 4da135961c57d3025bc26e3472819902e9b3e9c0

Compatibility test results for 7ef01a26f8d8a38610e3d364b722df517c970749 ==> 4da135961c57d3025bc26e3472819902e9b3e9c0 (PR)
1. Check liveness of validators at old version: 7ef01a26f8d8a38610e3d364b722df517c970749
compatibility::simple-validator-upgrade::liveness-check : committed: 15994.77 txn/s, latency: 2114.58 ms, (p50: 2100 ms, p70: 2200, p90: 2400 ms, p99: 2600 ms), latency samples: 518580
2. Upgrading first Validator to new version: 4da135961c57d3025bc26e3472819902e9b3e9c0
compatibility::simple-validator-upgrade::single-validator-upgrading : committed: 6735.94 txn/s, latency: 4129.98 ms, (p50: 4600 ms, p70: 4600, p90: 5200 ms, p99: 5500 ms), latency samples: 138720
compatibility::simple-validator-upgrade::single-validator-upgrade : committed: 7262.45 txn/s, latency: 4417.88 ms, (p50: 4600 ms, p70: 4700, p90: 6400 ms, p99: 6600 ms), latency samples: 243240
3. Upgrading rest of first batch to new version: 4da135961c57d3025bc26e3472819902e9b3e9c0
compatibility::simple-validator-upgrade::half-validator-upgrading : committed: 7474.75 txn/s, latency: 3697.21 ms, (p50: 4100 ms, p70: 4400, p90: 4800 ms, p99: 5100 ms), latency samples: 132000
compatibility::simple-validator-upgrade::half-validator-upgrade : committed: 7507.71 txn/s, latency: 4263.11 ms, (p50: 4500 ms, p70: 4600, p90: 6000 ms, p99: 6400 ms), latency samples: 249600
4. upgrading second batch to new version: 4da135961c57d3025bc26e3472819902e9b3e9c0
compatibility::simple-validator-upgrade::rest-validator-upgrading : committed: 10213.17 txn/s, latency: 2627.66 ms, (p50: 2500 ms, p70: 2900, p90: 4100 ms, p99: 4500 ms), latency samples: 176700
compatibility::simple-validator-upgrade::rest-validator-upgrade : committed: 8973.46 txn/s, latency: 3450.17 ms, (p50: 2800 ms, p70: 4300, p90: 5900 ms, p99: 7300 ms), latency samples: 297440
5. check swarm health
Compatibility test for 7ef01a26f8d8a38610e3d364b722df517c970749 ==> 4da135961c57d3025bc26e3472819902e9b3e9c0 passed
Test Ok

@vgao1996 vgao1996 merged commit 0cf8d44 into aptos-labs:main Oct 1, 2024
94 checks passed
3 participants