-
Notifications
You must be signed in to change notification settings - Fork 370
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Proposal/pull checks #5365
Proposal/pull checks #5365
Conversation
Added all reviewers from the pull request and changes PRs |
design/open/$PULLS.md
Outdated
@@ -0,0 +1,108 @@ | |||
# $PULLS Proposal |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you rename the name to be dolar_pulls.md
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, great to see how we're advancing on this!
I would really appreciate a feature comparison between
Examples:
- Reviews 🥳
- What happens when I delete a branch and create it elsewhere?
- Non-required actions (it is fine not to implement them immediately; I would however like to understand how they will fit in later on. For instance running a non-required action is a good way to gain confidence in an action.)
- In GH "pull PR" and "merge branch" are related but distinct operations. Here the only way to pull is to merge. Given that some merges succeed without happening, maybe we want to keep the distinction?
Anyway, N-E-A-T-!-:---)
design/open/$PULLS.md
Outdated
* `Source Branch` | ||
* `Source Ref` - The source branch head at the time of creation | ||
* `Destination Branch` | ||
* `Run ID` - The actions associated run id, may be empty |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this a list of run IDs? More generally, what "run IDs" are "associated" with a PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why only one runID per pull? if we allow list of runs we can support re-runs with the current head
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actions run ID - the ID of the actions run triggered by the on-pull
hook when creating the $PULLS instance.
Similarly to pull requests in GitHub - we don't really need to save the history of the previous failed runs - the only relevant one is the current.
design/open/$PULLS.md
Outdated
1. Check for conflicts between source and destination branch - if conflict is found, do not run actions and display an appropriate message | ||
2. Run actions - and associate RunID with instance | ||
|
||
This will return the $PULLS instance ID, that can be used later for querying and merging etc. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If a PULL$ is associated with exactly one <src, dst> pair, why do I need another identifier? Is the intent to identify CLOSED PULL$?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we don't explicitly delete closed and merged $PULLS, we require an additional identifier
design/open/$PULLS.md
Outdated
|
||
GET `/api/v1/repositories/{repository}/branches/{sourceBranch}/$PULLS/{destinationBranch}/{id}` | ||
|
||
Will return the status, corresponding actions RunID |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure having to specify branches helps if I anyway need to specify the ID.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, can we follow Github's lead here? For example, getting a PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The API conforms to the structure we currently use in many other APIs. Specifically it is similar the merge API.
It also enables us to be more efficient when listing from the ref-store
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also think that the suggested API will have a problem supply repository level information on current $PULLS.
If a pull lives under source branch it means I need to query all branches to understand the list of current pulls.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nopcoder I'm not sure it make sense to list all PRs in a repository (in contrary to GitHub) I think the more common use case is to list all PRs for a given branch - and in that sense this structure is more effecient
design/open/$PULLS.md
Outdated
Add an optional parameter to the Merge API for the $PULLS id. | ||
When performing a merge with a $PULLS id, perform the following preliminary checks: | ||
1. Verify source reference id with source branch head | ||
2. Verify actions run successful |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC this means that a run of an "action" needs to be associated with a particular commit digest. (This is always a good idea, of course!)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually a PR in this case. I can have a pre-commit
& post-commit
actions that are different than my on-pull
triggering. The first 2 are associated with a commit, the later is associated to a PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are both correct :)
design/open/$PULLS.md
Outdated
|
||
### Commit | ||
|
||
On commit, we will need to go over all the $PULLS where the given branch is the source branch and update them accordingly |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also after a merge, I think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need Delete and Merged? on a new commit you will go over all pulls of the branch, but what if the commit is merged? you will still go over it? who should be responsible for the deletion?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was leftover from a previous design which had a more complicated state-machine - removed.
@eden-ohana not sure what the comment refers to
design/open/$PULLS.md
Outdated
|
||
On creation the following occurs: | ||
|
||
1. Check for conflicts between source and destination branch - if conflict is found, do not run actions and display an appropriate message |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How? using diff?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
diff is a long operation, not sure we want to wait for it to create a pull
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If all other operations are not time consuming, we can run Diff. We have a synchronous API for that.
We also don't have to run the full Diff - we can stop at the first conflict.
Another option is to run it as an action, i.e. don't hold the PR creation request hanging.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Didn't go into the implementation details, but as a rule - it will probably be more time consuming to run the actions and then realize there's a conflict than checking for conflicts and run the actions only after verifying there are none
design/open/$PULLS.md
Outdated
* `Source Branch` | ||
* `Source Ref` - The source branch head at the time of creation | ||
* `Destination Branch` | ||
* `Run ID` - The actions associated run id, may be empty |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why only one runID per pull? if we allow list of runs we can support re-runs with the current head
design/open/$PULLS.md
Outdated
|
||
### Commit | ||
|
||
On commit, we will need to go over all the $PULLS where the given branch is the source branch and update them accordingly |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need Delete and Merged? on a new commit you will go over all pulls of the branch, but what if the commit is merged? you will still go over it? who should be responsible for the deletion?
design/open/$PULLS.md
Outdated
### $PULLS Object | ||
|
||
$PULLS will be saved in the ref-store under the key: | ||
`$PULLS/<src-branch>/<dest-branch>/<$PULLS-id>` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if a pull request is merged we can create another one? it will still be in the ref store, maybe we need to add to the required changes to manage the pull state on the ref store
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Once a $PULLS instance is merged it has reached the end of its life.
Creating a new $PULLS instance using the same source and destination, will trigger the actions on the current source head. There no connection between the two instances and both are saved in the ref-store with their state and ids.
design/open/$PULLS.md
Outdated
On creation the following occurs: | ||
|
||
1. Check for conflicts between source and destination branch - if conflict is found, do not run actions and display an appropriate message | ||
2. Run actions - and associate RunID with instance |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are these the same actions we run when wanting to merge pull?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No - merge actions run on the pre-merge
and post-merge
hooks .
These actions run on on-pull
hook which is triggered when creating a new $PULLS
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Getting closer 💪
design/open/$PULLS.md
Outdated
## How it will work | ||
|
||
Introduce a new entity called $PULLS at the repository level. The $PULLS entity is identified by a source branch and destination branch. | ||
Each source-destination pair can have only a single `OPEN` $PULLS instance. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the harm in starting without this limitation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The purpose was to save on resources, and prevent redundant execution of a possibly very long process (action runs)
I don't think implementing it is a big issue
design/open/$PULLS.md
Outdated
|
||
### Creating a new $PULLS instance | ||
|
||
POST `/api/v1/repositories/{repository}/branches/{sourceBranch}/$PULLS/{destinationBranch}` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can the source be a commit?
POST `/api/v1/repositories/{repository}/branches/{sourceBranch}/$PULLS/{destinationBranch}` | |
POST `/api/v1/repositories/{repository}/branches/{sourceRef}/$PULLS/{destinationBranch}` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No - I think we discussed it previously and agreed that similarly to pull request the operation is done on a source branch and not a specific reference
design/open/$PULLS.md
Outdated
|
||
On creation the following occurs: | ||
|
||
1. Check for conflicts between source and destination branch - if conflict is found, do not run actions and display an appropriate message |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If all other operations are not time consuming, we can run Diff. We have a synchronous API for that.
We also don't have to run the full Diff - we can stop at the first conflict.
Another option is to run it as an action, i.e. don't hold the PR creation request hanging.
design/open/$PULLS.md
Outdated
|
||
GET `/api/v1/repositories/{repository}/branches/{sourceBranch}/$PULLS/{destinationBranch}/{id}` | ||
|
||
Will return the status, corresponding actions RunID |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, can we follow Github's lead here? For example, getting a PR
design/open/$PULLS.md
Outdated
|
||
### Re-run Actions | ||
|
||
In the event of failed actions run, it will be possible to re-run the job via the Actions view |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"From the Actions view" suggests that all actions are now retryable. Is that what we want?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is part of the required changes for actions - unless we implement "re-run" in the $PULLS context
design/open/$PULLS.md
Outdated
|
||
In the event of failed actions run, it will be possible to re-run the job via the Actions view | ||
TBD: How to update runID of $PULLS instance on re-run | ||
TBD: Re-run of `on-pull` should be in the context of the branch and not of the specific ref. This is in contrast to how we would like to re-run |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about the conflict
checking? Do we run that again?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Conflict checking is performed in 2 places:
- The first time you create a $PULLS instance - to save redundant execution of actions
- As part of the merge process - exactly how it works today
In case of merge to source - rebase - conflict resolution the source head will change and the user will be required to re-run the actions
design/open/$PULLS.md
Outdated
TBD: How to update runID of $PULLS instance on re-run | ||
TBD: Re-run of `on-pull` should be in the context of the branch and not of the specific ref. This is in contrast to how we would like to re-run | ||
all other types of hooks which should run on the same reference. | ||
**Optional solution:** Do not implement re-run in actions context, instead implement rerun on $PULLS which will in fact, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if I haven't change my HEAD and I still want to retry? We don't have to support that at first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see a real use-case of re-running tests if they succeed.
If we want to initially support re-runs only at the $PULLS context, then we will be decide whatever policy we want to create a new actions run from the $PULLS instance
design/open/$PULLS.md
Outdated
|
||
### Commit | ||
|
||
On commit, we will need to go over all the $PULLS where the given branch is the source branch and update them accordingly |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Leftovers from previous state machine - removing
@@ -0,0 +1,1377 @@ | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we still need the state machine?
Co-authored-by: itaiad200 <itaiad200@gmail.com>
@arielshaqed Thanks for the valuable feedback!
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! I really like suggestion 2, it seems much leaner. I think we might make it even leaner by separating concerns.
Once defined in a lakeFS action, it will be possible to trigger the action | ||
manually with a mandatory reference (branch/commit/tag).The lakeFS action | ||
will run on the provided reference and the results will be available in the | ||
just like any other action run. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really like manually-run actions. They are really useful for many things, such as manually exporting data or just re-running some test that is known to be flaky. But isn't this orthogonal to the branch protection rule?
In detail: Not sure I understand why I need to trigger an action run. Why can't it be run as a postcommit hook or whatever? The only difference I can see is that such an action is non-blocking for commits -- so we need to implement a nonblocking / "start-only-but-don't-wait" action type. I think we would want it in any case, and the implementation for manual-trigger will in any case need to include this. Can we separate the two and make manual-run a separate topic?
And then we don't even need to implement manual-run actions at first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right about post-commit
hooks being enough for this task. In fact, today we don't wait for post hooks to complete, i.e. they are asynchronous. BUT - I think that some of the context is missing from this page. You can find more context here. TLDR: The requirements are that users don't want long running processes to execute with every commit. Potentially these executions are costly and should start explicitly by the user:
No description provided.