Closed
Description
Most (possibly all) of the nix-build jobs are being cancelled while in progress since the quotas changed. Adjust the workflows to fit within the new limits.
Context: since #6243 the CI jobs are grouped by refs and cancelled together. The existing "Nix CI" job wasn't prepared for this for several reasons:
- It builds many variants of `llama.cpp` in a single job.
- It only pushes the results to cachix after all of the builds have ended (not sure if it does the push in the "destructor" step after the cancellation).
- PRs from forks don't have access to the repo secrets so they don't push to cachix. However, it's plausible that these could make up the majority of all jobs?
- We're running pure nix-builds, meaning we can only cache store paths (results of complete and successful builds), not e.g. intermediate object files. This provides a strong guarantee that a passing CI means the build can be reproduced locally, but it also limits how much we can reuse between CI jobs.
References:
- Make IQ1_M work for QK_K = 64 #6327 (comment)
- ci: apply concurrency limit for github workflows #6243
Potential solutions
- Make `onPush` builds (`.#checks`) less pure
  - `ccacheStdenv`
  - check-pointing
- Run pure builds `onSchedule` instead
- More granular jobs: generate individual github jobs for individual attributes
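
A rough sketch of what the "more granular jobs" option could look like: a small helper that lists the attributes under `.#checks` and turns them into a GitHub Actions matrix, so each attribute gets its own job and can be cancelled or cached independently. This is only an illustration, not the workflow the repo uses. The function names, the `attr` matrix key, and the default `x86_64-linux` system are all made up for the example, and it assumes a Nix with flakes enabled so that `nix eval --json ... --apply builtins.attrNames` works.

```python
import json
import subprocess


def list_check_attrs(system: str = "x86_64-linux") -> list[str]:
    """Ask Nix for the attribute names under .#checks.<system>.

    Assumes a flakes-enabled `nix` on PATH and that the flake exposes
    `checks.<system>`; not called automatically in this sketch.
    """
    out = subprocess.run(
        ["nix", "eval", "--json", f".#checks.{system}",
         "--apply", "builtins.attrNames"],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out)


def build_matrix(attrs: list[str], system: str = "x86_64-linux") -> dict:
    """Build a matrix with one entry per attribute (hypothetical `attr` key)."""
    return {
        "include": [
            {"attr": f"checks.{system}.{name}"} for name in sorted(attrs)
        ]
    }


def matrix_output_line(matrix: dict) -> str:
    """Format the matrix as a `name=value` line for the $GITHUB_OUTPUT file."""
    return "matrix=" + json.dumps(matrix, separators=(",", ":"))
```

A setup job could append `matrix_output_line(build_matrix(list_check_attrs()))` to the file named by `$GITHUB_OUTPUT`, and a downstream job could read it with `fromJSON(...)` in `strategy.matrix` and run `nix build ".#${{ matrix.attr }}"`. Each attribute would then push to cachix as soon as its own build finishes, rather than after all of them.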
Questions
- How effective is the caching right now?
- PRs from forks aren't allowed to push to cachix