
Optimize CI #1310

Open

Description

As of this writing, our CI tests (specified in .github/workflows/ci.yml) take ~5.5m to run end-to-end during PR development and ~19.5m to run end-to-end in the merge queue. This significantly slows developer velocity, especially when developing a stack of features (i.e., one PR needs to land before the next PR can be seriously considered).

This task tracks optimizing our end-to-end CI latency. Anything is on the table!

Note that both the PR latency and the merge queue latency are in scope. PR latency is the more important metric, since PR tests may run multiple times during PR development. However, since GitHub has no automated way to merge a stack of PRs, we often have to watch the merge queue to know when we can kick off the next PR's merge. For this reason, merge queue latency is important as well.

Advice

As of this writing, we skip 5 out of 7 build targets and all Miri tests during PR development. Thus, the merge queue CI tests have somewhat different performance characteristics than PR CI tests.
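To make the difference between the two configurations concrete, here is a minimal sketch of event-dependent matrix selection. It is not how ci.yml is actually written: the function name `build_matrix`, the toolchain list, the target names, and the choice of which two targets are kept for PRs are all assumptions for illustration. Only the counts (7 targets, 5 skipped on PRs, no Miri on PRs) come from the description above; `pull_request` and `merge_group` are the GitHub Actions event names.

```python
from itertools import product

# Hypothetical inputs: the real toolchains and targets live in ci.yml.
TOOLCHAINS = ["stable", "nightly"]
TARGETS = [f"target-{i}" for i in range(1, 8)]  # 7 placeholder targets
PR_TARGETS = {"target-1", "target-2"}           # assume these 2 are kept on PRs


def build_matrix(event, toolchains=TOOLCHAINS, targets=TARGETS, pr_targets=PR_TARGETS):
    """Return one job description per (toolchain, target) pair for a CI event.

    On pull requests, only the targets in `pr_targets` are built and Miri is
    disabled; in the merge queue, every target is built and Miri runs.
    """
    is_pr = event == "pull_request"
    selected = [t for t in targets if not is_pr or t in pr_targets]
    return [
        {"toolchain": tc, "target": tg, "miri": not is_pr}
        for tc, tg in product(toolchains, selected)
    ]


pr_jobs = build_matrix("pull_request")
mq_jobs = build_matrix("merge_group")
print(len(pr_jobs), len(mq_jobs))
```

With these placeholder inputs the PR matrix is a small fraction of the merge queue matrix, and none of its jobs run Miri, which is why the two need to be profiled separately.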

In my own investigations, I've discovered the following:

  • In the merge queue, the bottleneck seems to be the build_test job, which encompasses the primary test matrix (there are other ancillary jobs such as kani, check_fmt, etc.; these do not appear to be the bottleneck)
  • Among individual matrix jobs, the distribution of times appears to be highly bimodal:
    • Most matrix jobs take ~1-2m to complete
    • Some matrix jobs take ~13m to complete
    • What distinguishes the two appears to be Miri tests, which are run only in the latter (~13m) group
  • It also seems to take a few minutes just to spawn all of the ~200 jobs in the matrix (before they start executing)
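The observations above suggest a simple latency model: if matrix jobs run in parallel, the matrix finishes when its slowest job does, plus whatever time it takes to spawn the jobs. The sketch below splits job durations into a fast and a slow group and estimates wall-clock latency under that assumption. The function name `summarize_matrix`, the 5-minute threshold, the 3-minute spawn delay, and the sample durations are all made up for illustration; real numbers would have to come from actual workflow runs.

```python
from statistics import median


def summarize_matrix(durations_min, spawn_delay_min=0.0, slow_threshold_min=5.0):
    """Group job durations (in minutes) and estimate end-to-end latency.

    Assumes every job runs in parallel, so the estimate is the spawn delay
    plus the duration of the slowest job.
    """
    fast = [d for d in durations_min if d < slow_threshold_min]
    slow = [d for d in durations_min if d >= slow_threshold_min]
    return {
        "fast_count": len(fast),
        "slow_count": len(slow),
        "fast_median": median(fast) if fast else None,
        "slow_median": median(slow) if slow else None,
        "estimated_latency": spawn_delay_min + max(durations_min, default=0.0),
    }


# Hypothetical durations: four quick jobs and two Miri-style slow jobs.
durations = [1.0, 1.5, 2.0, 1.2, 13.0, 12.5]
summary = summarize_matrix(durations, spawn_delay_min=3.0)
print(summary)
```

Under this model, speeding up the fast group does nothing for end-to-end latency; only shortening the slowest jobs or the spawn delay helps.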

We've already done some work to speed up Miri test execution (recently, #1307, #1308, and #1313). There is probably a lot more that could be done there.

There are probably also a lot of other optimization opportunities besides Miri; I just haven't taken the time to investigate in detail.

See also: #1312, #1314

Failed attempts

I tried these, but found no speedup, or wasn't able to get them working:

    Labels

    • experience-medium: This issue is of medium difficulty, and requires some experience
    • google-20%-project: Potential 20% project for a Google employee
    • help wanted: Extra attention is needed
