-
Notifications
You must be signed in to change notification settings - Fork 104
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement invalidate_on_push without pushedDate field #602
Conversation
First step in removing our use of the deleted pushedDate field to implement invalidate_on_push. Load the last policy-bot status (if it exists) to approximate the evaluation time instead.
It looks like this works in most situations, but one problem case is if a force-push replaces only some of the commits in the PR. Right now, that moves the evaluation cutoff to the time of the last remaining commit, which can lead to incorrect approval. I'm not sure how to solve this yet, since the force-push removed the state that we'd otherwise use to detect the push. It looks like force-pushes are recorded in the pull request timeline, but we had some historic issues with that API not returning information fast enough and leading to incorrect results (#47, #54), so I'm a bit hesitant to use it again. |
After running in to various edge cases that didn't work in the previous approach, I switched to listing the statuses on the (filtered) head commit and using the creation time of the oldest status as the push time. For the most part this is simpler although it does require new API calls. The new calls can be cached and I think are balanced by the removal of all the This method leaves a bit of a gap during webhook delivery and processing where an approval could be incorrectly ignored, but I think this is acceptable and is better than incorrectly approving a PR that should require a new approval. This should only be an issue if either processing is very delayed or a user leaves an approval within seconds of a new push. I need to test the interactions with ignored commits a bit more, but otherwise, I think this is working. |
@@ -301,10 +301,6 @@ func (r *Rule) filterEditedCandidates(ctx context.Context, prctx pull.Context, c | |||
func (r *Rule) filterInvalidCandidates(ctx context.Context, prctx pull.Context, candidates []*common.Candidate) ([]*common.Candidate, []*common.Dismissal, error) { | |||
log := zerolog.Ctx(ctx) | |||
|
|||
if !r.Options.InvalidateOnPush { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is already checked by the caller
@@ -31,7 +31,6 @@ UI to view the detailed approval status of any pull request. | |||
- [`or`, `and`, and `if` (Rule Predicates)](#or-and-and-if-rule-predicates) | |||
- [Cross-organization Membership Tests](#cross-organization-membership-tests) | |||
- [Update Merges](#update-merges) | |||
- [Private Repositories](#private-repositories) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This section was removed a long time ago, but we didn't update the ToC
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's not still necessary to maintain a ToC by hand, at least for the github web UI: https://github.blog/changelog/2021-04-13-table-of-contents-support-in-markdown-files/
|
||
// sortCommits orders commits in history order starting from head. It must be | ||
// called on the unfiltered set of commits. | ||
func sortCommits(commits []*pull.Commit, head string) []*pull.Commit { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is unnecessary because GitHub already returns the commits in this order, but I'd rather be explicit than have this break strangely if the order changes for some reason - this ordering is important for push detection to work in all cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it worth adding some small test cases for this ordering function?
membership map[string]bool | ||
statuses map[string]string | ||
labels []string | ||
pushedAt map[string]time.Time |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could actually be cached globally: once we know the push time for a specific commit in a specific repository, it shouldn't change. I think for now it's easiest to re-compute it on each request, but if the extra requests become a problem, this might be a good optimization.
@@ -84,12 +83,3 @@ func setStringFromEnv(key, prefix string, value *string) bool { | |||
} | |||
return false | |||
} | |||
|
|||
func setBoolFromEnv(key, prefix string, value *bool) bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't removing the support for global env also not a breaking change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With this change, we never attempt to load the pushedDate
field, so the do_not_load_pushed_date
option has no effect. In the YAML configuration, the parser errors if the file contains unknown fields. But for environment variables, there's no issue with setting values that Policy Bot doesn't know about, so I think this should be fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for explaining @bluekeyes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea to use past statuses on commits seems like a really good workaround for the missing data problem. Its a bit annoying that we've lost data from GH, but I think this should work in most cases. I'm a bit worried about some race between the few seconds after a push and a new status showing up, but as you mentioned in practice I think this probably won't be an issue.
|
||
// sortCommits orders commits in history order starting from head. It must be | ||
// called on the unfiltered set of commits. | ||
func sortCommits(commits []*pull.Commit, head string) []*pull.Commit { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it worth adding some small test cases for this ordering function?
👍 LGTM |
#612 adds a global cache for this data, since I'm worried about the cost of all the API calls to list statuses, particularly as a commit accumulates more statuses over time. I'll merge that separately, since this PR is already making enough changes. |
See: palantir/policy-bot#602 See: product-os/policy-bot#4 See: Change-type: patch Signed-off-by: Kyle Harding <kyle@balena.io>
See: palantir/policy-bot#602 See: product-os/policy-bot#4 See: product-os/environment-production#623 Change-type: patch Signed-off-by: Kyle Harding <kyle@balena.io>
This is a draft PR to explore how to implement the
invalidate_on_push
option without using thepushedDate
field that has been removed from the GraphQL API. The basic idea is to use the timestamps of past policy-bot statuses to figure out which comments/reviews/etc. are relevant for a particular evaluation:Right now, I've only changed the
pull.Context
and the logic described above, so it doesn't compile. I still need to finish connecting all the parts and then make sure everything works.Closes #598.