Skip to content

Update cache every run. Add /compiled and /logs. Make key sensitive to matrix. #71

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 47 commits into from
Nov 25, 2023

Conversation

IanButterworth
Copy link
Member

@IanButterworth IanButterworth commented Nov 19, 2023

Alternative to #45
Closes #11
Fixes #46

This keeps the depot in one cache file for the given key, which is sensitive to julia version, OS and arch (for the usual matrix var names, and there's a note if not). all matrix vars and the runner.OS.
It also re-saves the cache every run, removing the previous cache that was used.
By including /logs it also enables automatic Pkg.gc() to keep the depot size down, given that the depot will be restored>used>saved>restored>used>saved.. sequentially through the runs.

Regarding the split strategy in #45 , while there are some elements of the depot that are cross-OS compatible (/packages, /registries), for precompilation to remain hot, and for automatic Pkg.gc() to work properly, I think the depot should be cached in lock-step as a whole.

To implement this, because neither Github or actions/cache expose a way to update a cache file at a key, we have to run our own deletion just before the actions/cache save post step. So for a given key, there will only ever be one cache retained. This reduces the chance of the automatic eviction github does evicting a useful cache over an less useful one.

Demo

You can see it in action here JuliaIO/PNGFiles.jl#72
Tests have been disabled, so it's just installation and loading.

First commit was with no cache files in the repo.
Second empty commit benefitted from the caching.

On linux 1.11 goes from 2 minutes to 40s

Cache sizes are all around 30 MB.

And there will only ever be the same number as the permutations of the matrix, so 15 in that repo, which is a common number.

I will note that it seems there's been a sizeable regression in precompile duration in master recently. Needs investigating.

How to release

It'd be nice to get this widely adopted, so I'm intrigued about releasing this as a minor change.
There's no breakage to the API, except for changes to defaults.

Update README.md
Copy link
Member

@rikhuijzer rikhuijzer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice. Makes sense to me. The cache deletion logic looks a bit like a "hack", but it makes sense to me. For Javascript libraries, not updating the cache should be fine for small dependency changes. However, for Julia small dependency changes can outdate almost all compiled artifacts, so that's probably why forcing a new cache each time makes sense here?

One question regarding the cache delete logic. Isn't the most recent cache also deleted? EDIT: Found it. The caches are deleted post job and then the new one is stored. Very nice.

@IanButterworth
Copy link
Member Author

isn't the most recent cache deleted?

No, the cache deletion step happens just before the cache/save step.

The way post jobs are queued is a little confusing.

IanButterworth and others added 2 commits November 19, 2023 14:50
Co-authored-by: Rik Huijzer <github@huijzer.xyz>
Co-authored-by: Rik Huijzer <github@huijzer.xyz>
@IanButterworth
Copy link
Member Author

Ideally we'd only save if there were changes in the depot, but in practice that will be true every run because of the usage logs being updated. And usage logs allow us to do auto Pkg.gc

It kind of puts the decision on how to manage the depot in Julia's hands, which I think makes sense.

@IanButterworth
Copy link
Member Author

IanButterworth commented Nov 19, 2023

Reminder to myself. We might want to add an input to turn off cache deletion. (Done)

@IanButterworth
Copy link
Member Author

If you're able to review @SaschaMann let me know, otherwise I'll merge and release this as v1.4.0 tomorrow. (or let me know if you want more time)

@IanButterworth IanButterworth force-pushed the ib/compiledv2 branch 3 times, most recently from 23031dc to 959b929 Compare November 22, 2023 01:49
@IanButterworth
Copy link
Member Author

I've done some tidying. @rikhuijzer would you mind taking one last look before I merge? This is in the context of it being a non-breaking release

@IanButterworth
Copy link
Member Author

Ok, some tweaks:

  • Added include-matrix input bool to allow the user to disable automatically including the matrix. While including the matrix is a good default for standard julia package CI, a use case I have for it is where a single runner runner runs, then a matrix of many different permutations, and I want the matrix to use the cache from the first runner. So in that case ignoring the matrix is beneficial
  • Pass the run repo explicitly to the gh cache calls as it cannot be assumed that the working directory is the repo of the run
  • Using an underscore after the restore key helps to indicate the end of the part that restore keys should match. Otherwise too many caches were deleted
  • Explicitly run Pkg.gc before save to help keep the cache size down
  • List the depot tree size after restore, for info

@IanButterworth IanButterworth marked this pull request as ready for review November 25, 2023 03:07
@IanButterworth IanButterworth force-pushed the ib/compiledv2 branch 2 times, most recently from 1f61106 to 4252ef7 Compare November 25, 2023 04:49
@IanButterworth
Copy link
Member Author

ugh.. turns out you can just ${{ join(matrix.*, '-') }} to get all matrix values joined with -.

So that avoids the need for jq 🎉

I'll go ahead and merge, and release tomorrow as 1.4.0 unless I hear otherwise.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Julia's version is not taken into account Cache precompilation cache
3 participants