Skip to content

Do re-download corrupt objects with GVFS #782

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Aug 13, 2025

Conversation

dscho
Copy link
Member

@dscho dscho commented Aug 13, 2025

As of 9e59b38, Git will loudly complain about corrupt objects.

That is fine, as long as the idea isn't to re-download locally-corrupted objects. But that's exactly what we want to do in VFS for Git. This is even tested for in the functional tests of VFS for Git, which has been identified as a regression in microsoft/VFSForGit#1853.

@dscho dscho requested a review from mjcheetham August 13, 2025 13:46
@dscho dscho self-assigned this Aug 13, 2025
dscho added a commit to microsoft/VFSForGit that referenced this pull request Aug 13, 2025
I am letting GitHub Actions build a test version for that PR as I am
writing this.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
dscho added a commit to microsoft/VFSForGit that referenced this pull request Aug 13, 2025
Let's let it run on the `tmp` branch.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
dscho added a commit to microsoft/VFSForGit that referenced this pull request Aug 13, 2025
I am letting GitHub Actions build a test version for that PR as I am
writing this.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
derrickstolee
derrickstolee previously approved these changes Aug 13, 2025
Copy link

@derrickstolee derrickstolee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Is there any way we can test this without a hop to microsoft/vfsforgit?

@dscho
Copy link
Member Author

dscho commented Aug 13, 2025

LGTM. Is there any way we can test this without a hop to microsoft/vfsforgit?

Hmm. Valid point. The sad news is that the VFS for Git test case was introduced in microsoft/VFSForGit@2db0c03, which is one of those absurdly big and lamentably under-documented commits that I frequently use as examples why code changes really need to be crafted with skill.

But I realize that I even failed to target the correct commit with the fixup! in this PR, as there simply is no commit that actually deals with corrupt loose objects. Will reconsider.

If the `read-object` hook is not found, the code currently would fail
with the rather obscure message:

	fatal: Out of memory, strdup failed

The reason is because without a check whether the hook was found,
eventually `strvec_push()` would try to call `xstrdup()` with `cmd`
(which is `NULL` if the hook was not found) which would fail and print
that rather misleading message.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
@dscho dscho force-pushed the do-re-download-corrupt-objects-with-gvfs branch from fd34fe0 to cda6dd6 Compare August 13, 2025 19:29
@dscho
Copy link
Member Author

dscho commented Aug 13, 2025

@derrickstolee thank you for helping me be a better engineer. I thought that I had a really good reproducer, but I hadn't; My initial draft of a test case reused the corrupt_byte function of t1060, which replaces the 10th byte by a NUL, but that would not trigger the same code path as VFS for Git's previously-failing test case, and would actually not require any fix to let the read-object hook re-download the corrupt object. Then I tried to replace the loose object by a 0-sized file, but same thing: The code path was lenient enough to give the read-object hook a job and everything worked without the fix. In the end, I painstakingly imitated the VFS for Git approach, where the loose object file is truncated by 8 bytes. That did it.

This experience of trying to find a way to trigger the same code path that does require the fix, however, kind of makes me think that 9e59b38, despite the claims in its commit message, is doing an incomplete job of reporting corrupted objects. Oh well. Maybe some day, when I have some free time...

In the meantime, @derrickstolee, could I solicit another review, please?

@dscho
Copy link
Member Author

dscho commented Aug 13, 2025

Ach, darn it. macOS' sed strikes again.

As of 9e59b38 (object-file: emit corruption errors when detected,
2022-12-14), Git will loudly complain about corrupt objects.

That is fine, as long as the idea isn't to re-download locally-corrupted
objects. But that's exactly what we want to do in VFS for Git via the
`read-object` hook, as per the `GitCorruptObjectTests` code
added in microsoft/VFSForGit@2db0c030eb25 (New
features: [...] -  GVFS can now recover from corrupted git object files
[...] , 2018-02-16).

So let's support precisely that, and add a regression test that ensures
that re-downloading corrupt objects via the `read-object` hook works.

While at it, avoid the XOR operator to flip the bits, when we actually
want to make sure that they are turned off: Use the AND-NOT operator for
that purpose.

Helped-by: Matthew John Cheetham <mjcheetham@outlook.com>
Helped-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
@dscho dscho force-pushed the do-re-download-corrupt-objects-with-gvfs branch from cda6dd6 to db6f8f4 Compare August 13, 2025 20:05
@dscho dscho marked this pull request as ready for review August 13, 2025 22:09
@dscho dscho merged commit b65c8e1 into vfs-2.50.1 Aug 13, 2025
180 of 182 checks passed
@dscho dscho deleted the do-re-download-corrupt-objects-with-gvfs branch August 13, 2025 22:09
dscho added a commit that referenced this pull request Aug 13, 2025
As of 9e59b38, Git will loudly complain about corrupt objects.

That is fine, as long as the idea isn't to re-download locally-corrupted
objects. But that's exactly what we want to do in VFS for Git. This is
even tested for in the functional tests of VFS for Git, which has been
identified as a regression in
microsoft/VFSForGit#1853.
dscho added a commit that referenced this pull request Aug 13, 2025
As of 9e59b38, Git will loudly complain about corrupt objects.

That is fine, as long as the idea isn't to re-download locally-corrupted
objects. But that's exactly what we want to do in VFS for Git. This is
even tested for in the functional tests of VFS for Git, which has been
identified as a regression in
microsoft/VFSForGit#1853.
dscho added a commit that referenced this pull request Aug 13, 2025
As of 9e59b38, Git will loudly complain about corrupt objects.

That is fine, as long as the idea isn't to re-download locally-corrupted
objects. But that's exactly what we want to do in VFS for Git. This is
even tested for in the functional tests of VFS for Git, which has been
identified as a regression in
microsoft/VFSForGit#1853.
@dscho dscho mentioned this pull request Aug 13, 2025
dscho added a commit that referenced this pull request Aug 13, 2025
As of 9e59b38, Git will loudly complain about corrupt objects.

That is fine, as long as the idea isn't to re-download locally-corrupted
objects. But that's exactly what we want to do in VFS for Git. This is
even tested for in the functional tests of VFS for Git, which has been
identified as a regression in
microsoft/VFSForGit#1853.
@dscho dscho mentioned this pull request Aug 13, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants