Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add global cache for pushed times #612

Merged
merged 4 commits into from
Jul 26, 2023
Merged

Conversation

bluekeyes
Copy link
Member

This might be a premature optimization, but after testing, I'm worried that listing statuses on busy repositories will end up being too expensive, even with the HTTP caching. Since these values are safe to cache forever, add a simple in-memory LRU cache in each server instance to minimize API calls.

Also add a test for the PushedAt context function that verifies the caching behavior.

Base automatically changed from bkeyes/remove-pushed-date to develop July 24, 2023 21:42
bluekeyes and others added 2 commits July 24, 2023 14:43
This might be a premature optimization, but after testing, I'm worried
that listing statuses on busy repositories will end up being too
expensive, even with the HTTP caching. Since these values are safe to
cache forever, add a simple in-memory LRU cache in each server instance
to minimize API calls.

Also add a test for the PushedAt context function that verifies the
caching behavior.
pull/github.go Outdated
ghc.pushedAt[sha] = pushedAt
if gc := ghc.globalCache; gc != nil {
gc.SetPushedAt(repoID, sha, pushedAt)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need to fix this so that it doesn't cache the zero time, since that indicates "no pushed date" and is not globally cacheable.

Note that we might end up with the same commit having two pushed times as a result: the first evaluation will use the evaluation time and then future evaluations will use the time of the first status. I'd like to cache the evaluation time instead, but still trying to figure out how to do that in a way that makes sense, since it interacts with the commit filtering implemented at a different level.

@bluekeyes
Copy link
Member Author

Note for myself on how to fix the caching when there are no statuses:

  1. Remove search and commit sorting logic from the approval evaluator
  2. Change PushedAt to always return a valid time
  3. In PushedAt, if the commit does not have any statuses, list all commits and then work towards the head until we find a status. If we hit the head commit with no status, use the evaluation timestamp.
  4. Record the discovered time as the push time for all SHAs evaluated in (3)

This scanning will only happen when there are ignored commits (otherwise, we're always starting with the head commit), so should be pretty rare and the global cache will also help reduce the expense.

This will also fix a bug when evaluating PRs that have a batch push ending in an ignored commit followed directly by another push containing only ignored commits.

This required a bigger change than expected because we want to cache the
evaluation timestamp when we use it. This meant moving the batch
checking in to the PushedAt function instead of handling it in the
approval logic. The new version is closer to how things used to work
when we loaded the pushedDate field.
@bluekeyes bluekeyes merged commit 2655fb0 into develop Jul 26, 2023
5 checks passed
@bluekeyes bluekeyes deleted the bkeyes/cache-pushed-date branch July 26, 2023 00:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants