Skip to content

Conversation

@tooomm
Copy link
Contributor

@tooomm tooomm commented Mar 11, 2023

This splits actions/cache up in actions/cache/restore & actions/cache/save to ensure the cache is always saved, even when the image checks fail due to rate limiting or some broken links. The cache is still valuable.

This can also help with #204.
There, only the first 1k links are checked and we are then running into the API rate limiting which throws errors and the cache is not saved as a result to the failing job. Subsequent runs then start with the old cache again which does not contain the just correctly checked entries and starts from scratch failing at the same point. With this, following runs will start where the last check left off.

@ebbit1q
Copy link
Member

ebbit1q commented Mar 11, 2023

the cache only lasts for an hour, this wouldn't solve the problem, it'd mean the action now has to run twice to check the first thousand urls then fail and have to run again within an hour to check the left over 500, to solve the problem the action would have to enforce a rate limit.

@tooomm
Copy link
Contributor Author

tooomm commented Mar 11, 2023

See your PR #206 as an example, it fails and the valuable cache of 1500 successful link checks is not saved. This PR fixes this.
In general, isn't it just better to preserve the cache from this action, even when single links fail?

I never said it's solving the new API issue.
But as is, there is now way to check all links there. With this change, there is at least some way. In the future, the action will probably be capable to read rate limit headers and act accordingly.

Cache age is a topic on it's own. I'm debating to change it to 12h. Other suggestions?

@ebbit1q
Copy link
Member

ebbit1q commented Mar 11, 2023

ah, I misunderstood then, sorry about that. I agree that the cache persistence could be increased. though on the other hand the caching is more of a "nice to have" in my opinion, even without cache the action completes in seconds.

one consideration is that this would kind of work against #204 as it'd give a false representation of a run of the workflow when there is a current cache, completing successfully only then. we'd need some sort of way to trigger a cacheless run then, which right now is as simple as waiting an hour, increasing the time span would get in the way for that then...

@tooomm
Copy link
Contributor Author

tooomm commented Mar 12, 2023

though on the other hand the caching is more of a "nice to have" in my opinion, even without cache the action completes in seconds.

I don't think we use caching here to speed things up. It's to reduce the amount of requests we have to do repetitively.
We basically want to catch links that go kaput at one point, or new links containing typos etc.
For us it doesn't matter too much if the request is ok right now, or was 4h ago.

one consideration is that this would kind of work against #204 as it'd give a false representation of a run of the workflow when there is a current cache, completing successfully only then. we'd need some sort of way to trigger a cacheless run then, which right now is as simple as waiting an hour, increasing the time span would get in the way for that then...

I can see if there is a simple possibility to trigger cacheless runs.

But not sure if I understand the case you're lining out with 204 and what the issue is. Can you try to rephrase it maybe? Sorry 🙈

@ebbit1q
Copy link
Member

ebbit1q commented Mar 12, 2023

say we notice that a pr (eg 204) doesn't pass the tests, so we make a small change and push to see if this fixes it, now if the original cause was a rate limit ban the small change will seem to have fixed it as half the urls are in cache now. however if we were to then merge such a pr later, the test will fail again. this is what I mean with a false representation of a run of the workflow.

@tooomm
Copy link
Contributor Author

tooomm commented Mar 12, 2023

That is a good point, true!
I guess #209 circumvents this issue, as those links are not rate limited.

Copy link
Member

@ebbit1q ebbit1q left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see no reason not to include this, testing ratelimit issues is not the purpose of this workflow

@ebbit1q ebbit1q merged commit 7fe56fb into master Mar 31, 2023
@ebbit1q ebbit1q deleted the tooomm-cache branch March 31, 2023 15:53
@tooomm tooomm changed the title Picture check: Split cache action up to always save cache CI: Split cache action up to always save cache Apr 1, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants