Use a larger runner for tests #889

zanieb · 2024-01-11T19:18:18Z

Alternative to #875. Instead of partitioning tests across multiple runners via nextest, we use a larger GitHub Actions runner. Additionally, we explore using nextest to take advantage of the increased number of cores.

On the 8-core machine, nextest is 22% faster than insta. In combination with the vastly more readable output, I think this means we should switch over. As noted in #875 we lose the ability to detect unreferenced snapshot files but since we inline all of our snapshots this shouldn't matter.

Benchmarks

The following are the runtime of just the test portion of the test job in GitHub Actions except the partitioned case from #875 which requires a separate build step making runner overhead relevant.

The compile times are noted as a reference as a possible lower bound of test times. The compile time is included in all of the test times shown.

Where the nextest thread count is not noted, it is inferred from the CPU count.

test                                  time        diff
------------------------------------------------------
2-core (main)                         4m 53s
2-core-nextest-partioned (#875)       3m 56s      -19%
4-core-compilation                       32s      
4-core-insta                          1m 47s      -63%
4-core-nextest                        1m 40s      -66%
8-core-compilation                       18s      
8-core-insta                          1m  9s      -76%
8-core-nextest                        1m  5s      -78%
8-core-nextest-12-threads                54s      -82%
8-core-nextest-16-threads                55s      -82%

Cost

We must pay per-minute costs for these runners:

Larger runners are not eligible for the use of included minutes on private repositories. For both private and public repositories, when larger runners are in use, they will always be billed at the per-minute rate.

Compared to standard GitHub-hosted runners, larger runners are billed differently. Larger runners are only billed at the per-minute rate for the amount of time workflows are executed on them.

[source]

The per-minute rates are as follows:

Linux 2 $0.008 (main)
Linux 4 $0.016
Linux 8 $0.032 (pull request)

[source]

The per-minute cost increases by 4x but the workflow is 5.2x faster since we are making use of the extra compute. We will not get any free minutes executing these runners once the repository is public. Additionally, we will not make use of our 3,000 minutes / month of included minutes. Using the 8-core machines, the included 3,000 minutes should account for approximately ~$100.

Here's a brief analysis of costs from the last few


Minutes used
------------
November 1090 + 3000 = 4090
December 1357 + 3000 = 4357
January  2655 in 7 days
         ~3x more expected
                     = 11000 estimated

Costs
-----
November  1090 * 0.008   = $ 8.72
December  1357 * 0.008   = $10.86
January   8000 * 0.008   = $64 projected using 3000 included minutes and 2-core machines
          (11000 - (0.82 * 11000)) * 0.032
                         = $63 projected without included minutes and 4-core machines with perf improvement
          (11000 - (0.70 * 11000)) * 0.032
                         = $100 projected with a more conservative 70% reduction in total runtime

We can reduce costs (once public) by disabling larger runners for non-organization users e.g. PrefectHQ/prefect#9519

This reverts commit b11f1e9.

This reverts commit b673cae.

16 threads did not show benefit This reverts commit 2702ad6.

zanieb · 2024-01-11T20:08:55Z

.github/workflows/ci.yaml

+      # TODO(zanieb): The runner labels should be changed to include the OS and
+      #               templated here or moved to the matrix depending on how we
+      #               want things to display
+      labels: astral-8-core


Before merge, an organization admin (@charliermarsh) needs to change the name of this runner to ubuntu-latest-large-8 or similar.

The macOS runners are named "macos-latest-large" and "macos-latest-xlarge". We could consider just using "ubuntu-latest-large" for consistency in our workflow and change the number of cores in the admin panel as needed. A consistent name would make it really easy for us to have just the OS in the matrix.

charliermarsh · 2024-01-11T21:03:20Z

Since this only includes the test runtime, and not the compile time, we wouldn't expect an 82% speed overall right? Just looking at the cost analysis, which seems to assume one. (I still think it's fine.)

charliermarsh

What would you like the name change to be precisely

zanieb · 2024-01-11T21:09:55Z

@charliermarsh sorry that's not clear. All of the test run numbers include the compile time. I only listed the compile time separately to help see how much of the change was related to that. We should expect about an 82% improvement for compilation and testing. If we look at the total runtime of the job it's probably actually more like 70% overall runtime improvement (since setup isn't significantly faster) which comes out to ~$100 projected for January.

zanieb · 2024-01-11T21:11:27Z

ubuntu-latest-large please. Then I can match it to the macOS runner if we add tests for macOS.

charliermarsh · 2024-01-11T22:06:14Z

I renamed astral-8-core to ubuntu-latest-large

zanieb · 2024-01-11T22:14:15Z

Omg it's so fast. Note for posterity, there are larger runners e.g. 16, 32, and 64 core machines that I didn't bother testing. 1m seems good enough.

konstin · 2024-01-12T09:17:02Z

😍

zanieb added 4 commits January 11, 2024 13:18

Use a 4-core larger runner for tests

ab8395f

Use nextest test runner

b11f1e9

Use 8-core runner

06c0ab1

Restore insta test runner

b673cae

This reverts commit b11f1e9.

zanieb changed the title ~~Use a 4-core larger runner for tests~~ Use a larger runner for tests Jan 11, 2024

zanieb added 4 commits January 11, 2024 13:40

Restore nextest runner

d2a5299

This reverts commit b673cae.

Increase thread count for nextest to 12

9299917

Increase thread count to 16

2702ad6

Return to 12 threads

d881e14

16 threads did not show benefit This reverts commit 2702ad6.

zanieb commented Jan 11, 2024

View reviewed changes

zanieb marked this pull request as ready for review January 11, 2024 20:14

zanieb mentioned this pull request Jan 11, 2024

Use nextest test runner in CI #875

Closed

charliermarsh approved these changes Jan 11, 2024

View reviewed changes

Update name of larger runner

aef848a

zanieb merged commit 65c600b into main Jan 11, 2024
3 checks passed

zanieb deleted the zb/larger-runner branch January 11, 2024 22:14

zanieb mentioned this pull request Aug 23, 2024

Reduce CI time #5713

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use a larger runner for tests #889

Use a larger runner for tests #889

zanieb commented Jan 11, 2024 •

edited

Loading

zanieb Jan 11, 2024 •

edited

Loading

charliermarsh commented Jan 11, 2024

charliermarsh left a comment

zanieb commented Jan 11, 2024 •

edited

Loading

zanieb commented Jan 11, 2024

charliermarsh commented Jan 11, 2024

zanieb commented Jan 11, 2024

konstin commented Jan 12, 2024

Use a larger runner for tests #889

Use a larger runner for tests #889

Conversation

zanieb commented Jan 11, 2024 • edited Loading

Benchmarks

Cost

zanieb Jan 11, 2024 • edited Loading

Choose a reason for hiding this comment

charliermarsh commented Jan 11, 2024

charliermarsh left a comment

Choose a reason for hiding this comment

zanieb commented Jan 11, 2024 • edited Loading

zanieb commented Jan 11, 2024

charliermarsh commented Jan 11, 2024

zanieb commented Jan 11, 2024

konstin commented Jan 12, 2024

zanieb commented Jan 11, 2024 •

edited

Loading

zanieb Jan 11, 2024 •

edited

Loading

zanieb commented Jan 11, 2024 •

edited

Loading