Add -M/--find-renames option and blame.renames config to control rename detection #755
base: vfs-2.49.0
Co-authored-by: dscho <127790+dscho@users.noreply.github.com>
git blame in a repository cloned via VFS for Git
dscho
left a comment
I guess we have to abandon the idea of -M because that already exists (as you found out) but already means something else (yay consistency of Git's user interface).
    -M[<num>]::
    --find-renames[=<num>]::
        Detect moved or copied lines within a file. When a commit
No, you cannot do that. The existing -M option is not about whole-file rename detection, therefore we cannot reuse that, and my instructions regarding -M are moot.
You have to introduce a new --find-renames option instead.
blame.c
Outdated
    struct blame_origin *porigin = NULL;
    struct diff_options diff_opts;
    int i;
    extern int rename_detection_mode;
No, this won't work. You declared it as static in builtin/blame.c, which makes it file-local, and you cannot ever see it from this here file.
Instead, you have to introduce a new attribute in blame.h, probably in struct blame_scoreboard next to the xdl_opts (which are also diff-related).
blame.c
Outdated
    diff_opts.detect_rename = DIFF_DETECT_RENAME;
    /*
     * Use rename_detection_mode if specified, otherwise default to DIFF_DETECT_RENAME
     * For mode values > 0 and < 100, use it as similarity threshold
     */
    diff_opts.detect_rename = (rename_detection_mode == 0) ? 0 :
                              (rename_detection_mode > 0) ?
                              rename_detection_mode : DIFF_DETECT_RENAME;
This is the only place where you actually use rename_detection_mode, which you described elsewhere as a percentage. However, it is used exclusively as a Boolean. You have to study diff*.c harder to see how the --find-renames value is used there and imitate it here.
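The difference between an on/off switch and a similarity threshold can be observed with `git diff`, which already accepts `--find-renames=<n>`. The following is a minimal sketch, assuming `git` is installed; the throwaway repository, the `gen` helper, and the file contents are invented for illustration and are not part of the PR.

```shell
# Throwaway repository; contents are invented for illustration.
repo=$(mktemp -d)
git -C "$repo" init -q
git -C "$repo" config user.name "Example Author"
git -C "$repo" config user.email "author@example.com"

# gen <prefix> <first> <last>: print numbered lines (hypothetical helper).
gen () {
	i=$2
	while [ "$i" -le "$3" ]
	do
		echo "$1 line $i"
		i=$((i + 1))
	done
}

gen shared 1 10 >"$repo/1.txt"
git -C "$repo" add 1.txt
git -C "$repo" commit -q -m "add 1.txt"

# Rename the file and rewrite its last three lines (an inexact rename).
git -C "$repo" mv 1.txt 2.txt
{ gen shared 1 7; gen changed 8 10; } >"$repo/2.txt"
git -C "$repo" commit -q -a -m "rename 1.txt to 2.txt with edits"

# Same pair of commits, three different settings.
loose=$(git -C "$repo" diff --name-status --find-renames=30% HEAD^ HEAD)
exact=$(git -C "$repo" diff --name-status --find-renames=100% HEAD^ HEAD)
off=$(git -C "$repo" diff --name-status --no-renames HEAD^ HEAD)

printf 'threshold 30%%:\n%s\n\n' "$loose"
printf 'threshold 100%% (exact renames only):\n%s\n\n' "$exact"
printf 'rename detection off:\n%s\n' "$off"
```

With the low threshold the pair is reported as a rename; with 100% or with detection disabled it shows up as a separate addition and deletion. The number therefore has to reach the diff machinery as a score, not merely as "non-zero means on".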
t/t8015-blame-rename-detection.sh
Outdated
    . ./test-lib.sh

    test_expect_success 'setup test file rename with content changes' '
        git init &&
Do study other t/t-[0-9]*-*blame*.sh scripts. git init is not necessary.
t/t8015-blame-rename-detection.sh
Outdated
    echo abc >1.txt &&
    echo def >>1.txt &&
    echo ghi >>1.txt &&
Use test_write_lines instead.
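For readers outside Git's test suite: `test_write_lines` prints each argument on its own line, so one call replaces a chain of `echo` redirections. The sketch below uses a stand-in named `write_lines_sketch` (a hypothetical name, defined here so the snippet runs without `test-lib.sh`) and writes into a temporary directory.

```shell
# Stand-in for test_write_lines so the snippet runs on its own:
# print every argument followed by a newline.
write_lines_sketch () {
	printf "%s\n" "$@"
}

workdir=$(mktemp -d)

# The style used in the patch under review: one echo per line.
echo abc >"$workdir/echo.txt"
echo def >>"$workdir/echo.txt"
echo ghi >>"$workdir/echo.txt"

# The suggested style: a single call, a single redirection.
write_lines_sketch abc def ghi >"$workdir/1.txt"

# Both files have identical content.
cmp "$workdir/echo.txt" "$workdir/1.txt" && echo "identical"
```

Inside an actual test the call would simply read `test_write_lines abc def ghi >1.txt &&`.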
t/t8015-blame-rename-detection.sh
Outdated
    echo abc >1.txt &&
    echo def >>1.txt &&
    echo ghi >>1.txt &&
    git add . &&
. is sloppy. Specify 1.txt explicitly.
    echo def >>1.txt &&
    echo ghi >>1.txt &&
    git add . &&
    git commit -m "Initial commit" &&
You need to prefix this with test_tick to make the OIDs reproducible.
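Why this matters: a commit's object ID is a hash over its content, which includes the author and committer timestamps, so commits made at wall-clock time get different OIDs on every run. The sketch below approximates the mechanism with a hypothetical `tick_sketch` function; the starting timestamp is arbitrary, and the real `test_tick` in Git's test library should be consulted for its actual definition.

```shell
# Hypothetical approximation of what a "tick" helper does: pin the
# author and committer dates, advancing by 60 seconds on each call.
tick_sketch () {
	if [ -z "${tick_now+set}" ]
	then
		tick_now=1000000000	# arbitrary starting point (assumption)
	else
		tick_now=$((tick_now + 60))
	fi
	GIT_AUTHOR_DATE="$tick_now +0000"
	GIT_COMMITTER_DATE="$tick_now +0000"
	export GIT_AUTHOR_DATE GIT_COMMITTER_DATE
}

tick_sketch
first_date=$GIT_COMMITTER_DATE
tick_sketch
second_date=$GIT_COMMITTER_DATE

echo "first commit would be dated:  $first_date"
echo "second commit would be dated: $second_date"
```

Any `git commit` run after such a call picks the dates up from the environment, so the same sequence of commits hashes to the same OIDs every time.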
    # This test confirms that by default, git blame follows partial-file renames
    test_expect_success 'git blame follows inexact renames by default' '
        FIXED_1=$(git rev-parse --short HEAD^) &&
        FIXED_2=$(git rev-parse --short HEAD) &&
        git blame 2.txt >output &&
        grep "$FIXED_1" output | grep -q abc &&
        grep "$FIXED_2" output | grep -q 123 &&
        grep "$FIXED_1" output | grep -q ghi
    '
This is too focused, and unnecessary. We really only need to verify that 1.txt is mentioned only when the rename-detection score is high enough.
t/t8015-blame-rename-detection.sh
Outdated
    # This test confirms that --no-find-renames or -M0 turns off rename detection
    test_expect_success 'git blame can disable rename detection' '
        git blame --no-find-renames 2.txt >output &&
        ! grep -q 1.txt output
    '

    # This test confirms that -M100 only follows exact renames
    test_expect_success 'git blame can restrict to exact renames' '
        git blame -M100 2.txt >output &&
        ! grep -q 1.txt output
    '

    # This test checks that blame.renames config works
    test_expect_success 'blame.renames=false disables rename detection' '
        git -c blame.renames=false blame 2.txt >output &&
        ! grep -q 1.txt output
    '

    # This test checks that -M with a score works
    test_expect_success 'git blame with similarity score follows renames above threshold' '
        # Must follow 1.txt->2.txt rename for abc which are identical
        git blame -M70 2.txt >output &&
        grep "$FIXED_1" output | grep -q abc &&
        # Should not follow for others below threshold
        grep "$FIXED_2" output | grep -q 123 &&
        grep "$FIXED_2" output | grep -q ghi
    '

    # This test checks that -M overrides blame.renames
    test_expect_success '-M overrides blame.renames config' '
        # Using blame.renames=false but -M60
        git -c blame.renames=false blame -M60 2.txt >output &&
        grep "$FIXED_1" output | grep -q abc &&
        # The rest would be below 60% threshold
        grep "$FIXED_2" output | grep -q 123 &&
        grep "$FIXED_2" output | grep -q ghi
    '

    # This test checks that blame.renames with a score works
    test_expect_success 'blame.renames with score controls rename threshold' '
        # Set threshold at 70%, abc is identical so above threshold
        git -c blame.renames=70 blame 2.txt >output &&
        grep "$FIXED_1" output | grep -q abc &&
        # Other lines below threshold
        grep "$FIXED_2" output | grep -q 123 &&
        grep "$FIXED_2" output | grep -q ghi
    '
Those are waaaaay too many test cases for the job. All you need to do is to run git blame with various ways to specify the rename score, and then verify that 1.txt is either mentioned or not, depending on that score.
In fact, I highly suspect that this single test case that you need to add for this entire PR would find a much nicer home in one of the existing t/t*-blame*.sh scripts.
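The "is the old file name mentioned?" check can be tried against stock `git blame`, which follows whole-file renames by default and prints the original path for lines it traces back across one. The sketch below, assuming `git` is installed, sets up one rename above the default similarity threshold and one well below it. The repository, the `gen` helper, and the `low-old.txt`/`low-new.txt` names are invented for illustration, and the options proposed in this PR are deliberately not used, since they do not exist in released Git.

```shell
repo=$(mktemp -d)
git -C "$repo" init -q
git -C "$repo" config user.name "Example Author"
git -C "$repo" config user.email "author@example.com"

# gen <prefix> <first> <last>: print numbered lines (hypothetical helper).
gen () {
	i=$2
	while [ "$i" -le "$3" ]
	do
		echo "$1 line $i"
		i=$((i + 1))
	done
}

gen alpha 1 10 >"$repo/1.txt"
gen beta 1 10 >"$repo/low-old.txt"
git -C "$repo" add 1.txt low-old.txt
git -C "$repo" commit -q -m "add both files"

# 1.txt -> 2.txt keeps 8 of 10 lines; low-old.txt -> low-new.txt keeps 2.
git -C "$repo" mv 1.txt 2.txt
git -C "$repo" mv low-old.txt low-new.txt
{ gen alpha 1 8; gen edited 9 10; } >"$repo/2.txt"
{ gen beta 1 2; gen rewritten 3 10; } >"$repo/low-new.txt"
git -C "$repo" commit -q -a -m "rename both files with edits"

git -C "$repo" blame 2.txt >"$repo/high.out"
git -C "$repo" blame low-new.txt >"$repo/low.out"

# The whole verification is one grep per case.
if grep -q "1\.txt" "$repo/high.out"
then
	echo "2.txt: rename followed, 1.txt is mentioned"
fi
if ! grep -q "low-old\.txt" "$repo/low.out"
then
	echo "low-new.txt: rename not followed, old name is absent"
fi
```

A single test in an existing blame script could apply the same one-line check after each way of specifying the score, without asserting on individual blamed lines.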
When running `git blame` on large repositories, the automatic rename detection can cause performance issues because it unnecessarily compares blob contents to follow inexact renames. This PR adds options to control this behavior:

Added new features:

- Command line options to control rename detection in `git blame`:
  - `-M[<n>]` / `--find-renames[=<n>]` to specify the similarity threshold
  - `-M0` or `--no-find-renames` turns off rename detection completely
  - `-M100` limits detection to only exact renames (identical blob content)
- New configuration option `blame.renames` that can be set to:
  - `true` or `1` - follow renames with default similarity index (50%)
  - `false` or `0` - disable rename detection entirely
  - `copy` - detect copies as well as renames

Documentation improvements:

- `git-blame.adoc`
- `-M`/`--find-renames` option in `blame-options.adoc`
- `blame.renames` config in `config/blame.adoc`

Testing:

- `t/t8015-blame-rename-detection.sh` that verifies the new functionality

Example:
Fixes #753.