-
Notifications
You must be signed in to change notification settings - Fork 16.4k
Fix basic static checks comparing same commit with itself #60194
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The prek command was comparing `github.sha` with itself, which checked no files. Now uses `merge-base` to check all files changed in the PR against the base branch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Fixes a bug in the basic-tests workflow where static checks were comparing the same commit (github.sha) with itself, resulting in no files being checked. The fix introduces a merge-base calculation to properly identify all changed files in a PR by comparing against the base branch.
Key changes:
- Implements merge-base calculation to find the common ancestor between PR and base branch
- Updates static check command to compare merge-base commit against HEAD instead of comparing
github.shawith itself - Increases fetch depth and adds fallback logic to handle cases where merge-base isn't found in shallow clones
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| git fetch origin "$BASE_BRANCH" --depth=50 | ||
|
|
||
| # Try to find merge-base, deepen if not found | ||
| MERGE_BASE=$(git merge-base "origin/$BASE_BRANCH" HEAD 2>/dev/null || echo "") | ||
|
|
||
| if [ -z "$MERGE_BASE" ]; then | ||
| echo "Merge-base not found with depth=50, fetching more history..." | ||
| git fetch --deepen=450 | ||
| git fetch origin "$BASE_BRANCH" --deepen=450 | ||
| MERGE_BASE=$(git merge-base "origin/$BASE_BRANCH" HEAD) | ||
| fi | ||
|
|
||
| echo "sha=${MERGE_BASE}" >> $GITHUB_OUTPUT | ||
| env: | ||
| BASE_BRANCH: ${{ github.base_ref || 'main' }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| git fetch origin "$BASE_BRANCH" --depth=50 | |
| # Try to find merge-base, deepen if not found | |
| MERGE_BASE=$(git merge-base "origin/$BASE_BRANCH" HEAD 2>/dev/null || echo "") | |
| if [ -z "$MERGE_BASE" ]; then | |
| echo "Merge-base not found with depth=50, fetching more history..." | |
| git fetch --deepen=450 | |
| git fetch origin "$BASE_BRANCH" --deepen=450 | |
| MERGE_BASE=$(git merge-base "origin/$BASE_BRANCH" HEAD) | |
| fi | |
| echo "sha=${MERGE_BASE}" >> $GITHUB_OUTPUT | |
| env: | |
| BASE_BRANCH: ${{ github.base_ref || 'main' }} | |
| git fetch origin "$BASE_BRANCH" --depth=$FETCH_DEPTH | |
| # Try to find merge-base, deepen if not found | |
| MERGE_BASE=$(git merge-base "origin/$BASE_BRANCH" HEAD 2>/dev/null || echo "") | |
| if [ -z "$MERGE_BASE" ]; then | |
| echo "Merge-base not found with depth=$FETCH_DEPTH, fetching more history..." | |
| git fetch --deepen=$DEEPEN | |
| git fetch origin "$BASE_BRANCH" --deepen=$DEEPEN | |
| MERGE_BASE=$(git merge-base "origin/$BASE_BRANCH" HEAD) | |
| fi | |
| echo "sha=${MERGE_BASE}" >> $GITHUB_OUTPUT | |
| env: | |
| BASE_BRANCH: ${{ github.base_ref || 'main' }} | |
| FETCH_DEPTH: 50 | |
| DEEPEN: 450 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Small nit, maybe we can make the values parameterised for both --depth and --deepen from env :)
amoghrajesh
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow, thats a strange one
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Strange one .. I wonder when it happened and how it did not work (I am sure it worked before). But it can be done way simpler.
All you need it is to add missing ^ in --from-ref
At the moment you do this check - github.sha already contains merge commit - this is how pull-request workflow works.
Note that GITHUB_SHA for this event is the last merge commit of the pull request merge branch. If you want to get the commit ID for the last commit to the head branch of the pull request, use github.event.pull_request.head.sha instead.
|
Superseded by #60202 |
|
This is precisely why we needed |
|
I merged my fix and rebased #60168 to verify that it works (it should fail) |
|
Nice catch @jedcunningham BTW :) |
|
Yep. Nicely failed now https://github.com/apache/airflow/actions/runs/20776566467/job/59663769292?pr=60168 |
|
Nice, I'd considered just adding "^" but somehow I confused myself thinking that would only go 1 commit up - that if we had many commits on the branch it'd only look at the last one. I guess it must squash? Anyways, glad its working. edit:
Ah, a merge commit. Confusing since it has 2 parents. Apparently the "first" parent is the commit from the base branch, which is why this works. TIL. |
Correct. This is actually what we very heavily bank on (this why I knew it by heart). def get_changed_files(commit_ref: str | None) -> tuple[str, ...]:
if commit_ref is None:
return ()
cmd = [
"git",
"diff-tree",
"--no-commit-id",
"--name-only",
"-r",
commit_ref + "^",
commit_ref,
]
result = run_command(cmd, check=False, capture_output=True, text=True)
if result.returncode != 0:
get_console().print(
f"[warning] Error when running diff-tree command [/]\n{result.stdout}\n{result.stderr}"
)
return ()
changed_files = tuple(result.stdout.splitlines()) if result.stdout else ()
get_console().print("\n[info]Changed files:[/]\n")
get_console().print(changed_files)
get_console().print()
return changed_files |
|
BTW. There is also another interesting thing I learned with There is a very nice shortcut to not have to find
Spot the difference .... there are THREE dots in second case. And the ... is actually showing the changes in your branch from the And ... (pun intended) there are few other gotchas: https://darekkay.com/blog/git-commit-ranges/#git-diff For example Git is wild. |
|
So basically if you want to have prek to execute on "only your changes" from a branch you have PR to main with - do this (without running any of (we have it in our docs) |
The prek command was comparing
github.shawith itself, which checked no files. Now usesmerge-baseto check all files changed in the PR against the base branch.This ensures static checks are actually ran in PRs like #60168.