Rebase to v2.51.0 #785

Open: wants to merge 305 commits into base: vfs-2.51.0

@dscho (Member) commented Aug 19, 2025

The usual rebase to upstream.

jeffhostetler and others added 30 commits August 19, 2025 11:11
Teach status serialization to take an optional pathname on
the command line to direct that cache data be written there
rather than to stdout.  When used this way, normal status
results will still be written to stdout.

When no path is given, only binary serialization data is
written to stdout.

Usage:
    git status --serialize[=<path>]

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Teach status deserialize code to reject status cache
when printing in porcelain V2 and there are unresolved
conflicts in the cache file.  A follow-on task might
extend the cache format to include this additional data.

See code for longer explanation.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
When using a virtual file system layer, the FSMonitor does not make
sense.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Changes to the global or repo-local excludes files can change the
results returned by "git status" for untracked files.  Therefore,
it is important that the exclude-file values used during serialization
are still current at the time of deserialization.

Teach "git status --serialize" to report metadata on the user's global
exclude file (which defaults to "$XDG_CONFIG_HOME/git/ignore") and for the
repo-local excludes file (which is in ".git/info/exclude").  Serialize
will record the pathnames and mtimes for these files in the serialization
header (next to the mtime data for the .git/index file).

Teach "git status --deserialize" to validate this new metadata.  If either
exclude file has changed since the serialization-cache-file was written,
then deserialize will reject the cache file and force a full/normal status
run.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
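
The validation described in the commit message above can be illustrated with a minimal sketch. This is not the code from the commit: the function names and the header layout (a list of path/mtime pairs) are assumptions made for illustration only. It records each exclude file's pathname and mtime when the cache is written, and treats the cache as stale if any of them differs later.

```python
import os
import tempfile


def snapshot_exclude_metadata(paths):
    """Record (path, mtime_ns) for each exclude file.

    Hypothetical helper: a missing file is recorded with mtime None so that
    creating it later also invalidates the cache.
    """
    header = []
    for path in paths:
        try:
            mtime = os.stat(path).st_mtime_ns
        except FileNotFoundError:
            mtime = None
        header.append((path, mtime))
    return header


def exclude_metadata_is_current(header):
    """Return True only if every recorded file still has the recorded mtime."""
    paths = [path for path, _ in header]
    return snapshot_exclude_metadata(paths) == header
```

A caller would take the snapshot at serialization time, store it next to the cached status data, and run the check before trusting the cache; a failed check corresponds to the "reject and run a normal status" path.
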
When sparse-checkout is enabled, add the sparse-checkout percentage to
the Trace2 data stream.  This number was already computed and printed
on the console in the "You are in a sparse checkout..." message.  It
would be helpful to log it too for performance monitoring.

Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>
Teach `git status --deserialize` to either wait indefinitely
or immediately fail if the status serialization cache file
is stale.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Add VFS checkout hydration percentage information to the default `git
status` output.  When VFS is enabled, users will now see a "You are in
a partially-hydrated checkout with <percentage> of tracked files
present." message.

Upstream `git status` normally prints a "You are in a sparse checkout
with <percentage> of tracked files present."  This message was hidden
in `microsoft/git` when `core_virtualfilesystem` is set (because GVFS
users are always (and secretly) in a sparse checkout) and it was
thought that it would annoy users.

However, we now believe that it may be helpful for users to always see
the percentage and know when they are over-hydrated, since
over-hydration can occur by accident and may greatly impact their Git
performance.  Knowing this value may help with GVFS support.

Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>
Add trace2 region around read_object_process to collect
time spent waiting for missing objects to be dynamically
fetched.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
With the "--untracked-files=complete" option status computes a
superset of the untracked files.  We use this when writing the
status cache.  If subsequent deserialize commands ask for either
the complete set or one of the "no", "normal", or "all" subsets,
it can still use the cache file because of filtering in the
deserialize parser.

When running status with the "-uno" option, the long format
status would print a "(use -u to show untracked files)" hint.

When deserializing with the "-uno" option and using a cache computed
with "-ucomplete", the "nothing to commit, working tree clean" message
would be printed instead of the hint.

It was easy to miss because the correct hint message was printed
if the cache was rejected for any reason (and status did the full
fallback).

The "struct wt_status des" structure was initialized with the
content of the status cache (and thus defaulted to "complete").
This change sets "des.show_untracked_files" to the requested
subset from the command-line or config.  This allows the long
format to print the hint.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
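
The bug and fix described above can be modeled in a few lines. This is a simplified sketch, not the wt-status code: the function names, the `apply_fix` switch, and the exact message strings are assumptions for illustration. The point is that the long-format footer depends on the untracked-files mode held by the deserialized structure, so that mode has to come from the request rather than from the cache.

```python
def long_format_footer(show_untracked_files):
    """Pick the footer for a clean tree based on the untracked-files mode."""
    if show_untracked_files == "no":
        return "nothing to commit (use -u to show untracked files)"
    return "nothing to commit, working tree clean"


def deserialized_footer(cache_mode, requested_mode, apply_fix):
    """Model which mode the deserialized status structure ends up with.

    Without the fix, the mode is inherited from the cache (for example
    "complete"); with the fix, it is the mode requested by the user.
    """
    effective_mode = requested_mode if apply_fix else cache_mode
    return long_format_footer(effective_mode)
```

With a cache written using "complete" and a request for "no", `apply_fix=False` yields the "working tree clean" footer and `apply_fix=True` yields the hint.
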
Add trace2 region and data events describing attempts to deserialize
status data using a status cache.

A category:status, label:deserialize region is pushed around the
deserialize code.

Deserialization results when reading from a file are:
    category:status, path   = <path>
    category:status, polled = <number_of_attempts>
    category:status, result = "ok" | "reject"

Results when reading from STDIN are:
    category:status, path   = "STDIN"
    category:status, result = "ok" | "reject"

Status will fall back and run a normal status scan when a "reject"
is reported (unless "--deserialize-wait=fail").

If "ok" is reported, status was able to use the status cache and
avoid scanning the workdir.

Additionally, a cmd_mode is emitted for each step: collection,
deserialization, and serialization.  For example, if deserialization
is attempted and fails and status falls back to actually computing
the status, a cmd_mode message containing "deserialize" is issued
and then a cmd_mode for "collect" is issued.

Also, if deserialization fails, a data message containing the
rejection reason is emitted.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
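
The event ordering described above can be sketched as a small model. This does not call the Trace2 API and is not the code from the commit; the tuple shapes and the function name are hypothetical. It only shows which events appear, in which order, for the three cases: cache accepted, cache rejected with fallback, and cache rejected with `--deserialize-wait=fail`.

```python
def status_trace_events(cache_usable, wait_mode="no"):
    """Return the ordered events one status run would emit in this model."""
    events = [("cmd_mode", "deserialize")]
    if cache_usable:
        # The cache was accepted, so no workdir scan is needed.
        events.append(("data", "result", "ok"))
        return events
    events.append(("data", "result", "reject"))
    if wait_mode == "fail":
        # No fallback: the run stops after the rejection is reported.
        return events
    # Fallback: status computes the result with a normal scan.
    events.append(("cmd_mode", "collect"))
    return events
```

For example, `status_trace_events(False)` ends with a `("cmd_mode", "collect")` entry, matching the "deserialize, then collect" sequence the commit message gives for a failed deserialization.
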
When using fsmonitor the CE_FSMONITOR_VALID flag should be checked when
wanting to know if the entry has been updated. If the flag is set the
entry should be considered up to date and the same as if the CE_UPTODATE
is set.

In order to trust the CE_FSMONITOR_VALID flag, the fsmonitor data needs to
be refreshed when the fsmonitor bitmap is applied to the index in
tweak_fsmonitor. Since the fsmonitor data is kept up to date for every
command, some tests needed to be updated to take that into account.

istate->untracked->use_fsmonitor was set in tweak_fsmonitor when the
fsmonitor bitmap data was loaded and is now in refresh_fsmonitor since
that is being called in tweak_fsmonitor. refresh_fsmonitor will only be
called once and any other callers should be setting it when refreshing
the fsmonitor data so that code can use the fsmonitor data when checking
untracked files.

When writing the index, fsmonitor_last_update is used to determine if
the fsmonitor bitmap should be created and the extension data written to
the index. When running through unpack-trees this is not copied to the
result index. As a result, the next git command has to do all the work
of lstating every file to determine what is clean: because no fsmonitor
data was saved in the index extension, all entries in the index are
marked as dirty.

Copying the fsmonitor_last_update to the result index will cause the
extension data for fsmonitor to be in the index for the next git command
to use.

Signed-off-by: Kevin Willford <Kevin.Willford@microsoft.com>
Add trace information around status serialization.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
The fsmonitor script that can be used for running all the git tests
using watchman was causing some of the tests to fail because it wrote
to stderr and created some files for debugging purposes.

Add a new debug script to use with debugging and modify the other script
to remove the code that would cause tests to fail.

Signed-off-by: Kevin Willford <Kevin.Willford@microsoft.com>
Report virtual filesystem summary data.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Disable deserialization when verbose output requested.

Verbose mode causes Git to print diffs for modified files.
This requires the index to be loaded to have the currently
staged OID values.  Without loading the index, verbose output
makes it look like everything was deleted.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Create trace2_initialize_clock() and call from main() to capture
process start time in isolation and before other sub-systems are
ready.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Verify that `git status --deserialize=x -v` does not crash and
generates the same output as a normal (scanning) status command.

These issues are described in the previous 2 commits.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Teach Git to not throw a fatal error when an explicitly-specified
status-cache file (`git status --deserialize=<foo>`) could not be
found or opened for reading, and to silently fall back to a traditional
scan.

This matches the behavior when the status-cache file is implicitly
given via a config setting.

Note: the current version causes a test to start failing. Mark this as
an expected result for now.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Add trace2_thread_start() and trace2_thread_exit() events to the worker
threads used to read the index.  This gives per-thread perf data.

These workers were introduced in:
abb4bb8 read-cache: load cache extensions on a worker thread
77ff112 read-cache: load cache entries on worker threads

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
…ension

Add regions around code to read and write the cache-tree extension
when the index is read or written.

This is an experiment and may be dropped in future releases if
we don't need it anymore.

This experiment demonstrates that it takes more time to parse and
deserialize the cache-tree extension than it does to read the
cache-entries.

Commits [1] and [2] spread cache-entry reading across N-1 cores
and dedicates a single core to simultaneously read the index extensions.

Local testing (on my machine) shows that reading the cache-tree extension
takes ~0.28 seconds.  The 11 cache-entry threads take ~0.08 seconds.
The main thread is blocked for 0.15 to 0.20 seconds waiting for the
extension thread to finish.

Let's use this commit to gather some telemetry and confirm this.

My point is that improvements, such as index V5 which makes the
cache entries smaller, may improve performance, but the gains may
be limited because of this extension.  And that we may need to
look inside the cache-tree extension to truly improve do_read_index()
performance.

[1] abb4bb8 read-cache: load cache extensions on a worker thread
[2] 77ff112 read-cache: load cache entries on worker threads

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
…and report_tracking()

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Teach subprocess_start() to use a copy of the passed `cmd` string
rather than borrowing the buffer from the caller.

Some callers of subprocess_start() pass the value returned from
find_hook() which points to a static buffer and therefore is only
good until the next call to find_hook().  This could cause problems
for the long-running background processes managed by sub-process.c
where later calls to subprocess_find_entry() to get an existing
process will fail.  This could cause more than 1 long-running
process to be created.

TODO Need to confirm, but if only read_object_hook() uses
TODO subprocess_start() in this manner, we could drop this
TODO commit when we drop support for read_object_hook().

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
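
The hazard described above is a C lifetime issue, but it can be shown by analogy. The sketch below is not the sub-process.c code: `find_hook_like`, `SubprocessRegistry`, and the `/hooks/` paths are hypothetical stand-ins, and a shared `bytearray` plays the role of the static buffer. An entry that borrows the buffer silently changes when the buffer is reused, so a later lookup by the original command misses; an entry that stores a copy does not.

```python
_static_buf = bytearray()


def find_hook_like(name):
    """Stand-in for a function that returns a pointer to a static buffer.

    Every call overwrites the same buffer and returns it.
    """
    _static_buf[:] = f"/hooks/{name}".encode()
    return _static_buf


class SubprocessRegistry:
    """Tracks long-running processes keyed by their command string."""

    def __init__(self, copy_cmd):
        self.copy_cmd = copy_cmd
        self.entries = []

    def start(self, cmd):
        # Copying takes a private snapshot; borrowing keeps the shared buffer.
        self.entries.append(bytes(cmd) if self.copy_cmd else cmd)

    def find(self, cmd):
        return any(bytes(entry) == bytes(cmd) for entry in self.entries)
```

Registering `find_hook_like("read-object")` in both kinds of registry and then calling `find_hook_like("pre-command")` leaves only the copying registry able to find `/hooks/read-object`. A failed lookup is what would lead a caller to start a second long-running process.
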
Add data for the number of files created/overwritten and deleted during the checkout.

Give proper category name to all events in unpack-trees.c and eliminate "exp".

This is modified slightly from the original version due to interactions with 26f924d
(unpack-trees: exit check_updates() early if updates are not wanted, 2020-01-07).

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
dscho and others added 18 commits August 19, 2025 11:13
In some instances, CodeQL's web UI on github.com leaves questions
unanswered. For example, in some alerts it is really necessary to follow
the entire "taint flow" to understand why something might be an issue.

The alerts for the `cpp/uncontrolled-allocation-size` rule, for example,
are all false positives, and only when inspecting the exact flow does it
become obvious that one alert wants to point out that the size of a
binary patch hunk, which is specified in the patch, is then used to
determine how much memory to allocate, which may potentially run out of
memory (and is hence just Git doing what it is asked to, and does not
need to be changed).

To help with those issues, publish the `.sarif` file as part of every
workflow run. This allows downloading that file and inspecting it e.g.
with the SARIF viewer extension in VS Code (for details, see
https://marketplace.visualstudio.com/items?itemName=MS-SarifVSCode.sarif-viewer).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
As pointed out by CodeQL, `lookup_commit()` can return NULL.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
A couple of CodeQL's queries are opinionated in a way that is obviously
not shared by Git's source code's state, and apparently intentionally so.

For example, the "For loop variable changed in body" query as well as
the "No trivial switch statements" one result in too many results that
are apparently intentional in Git's source code. Let's not worry about
those, then. Also, Git has plenty of instances where variables shadow
other variables.

Other valid yet not quite critical issues identified by CodeQL include
complex conditionals and nested switch statements spanning several
pages.

We probably want to address these issues at some stage, but they are not
as critical as other problems pointed out by CodeQL, so let's silence
those queries for now and take care of them at a later stage.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The code is a bit too hard to reason about for CodeQL to figure out
whether the `fill_commit_graph_info()` function is at all called after
`write_commit_graph()` returns (and hence whether `topo_levels` goes out
of context before it is used again).

The Git project insists that this is correct (and does not want to make
the code more obviously correct), so let's silence CodeQL's complaints
in this instance.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
…ray past end

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
…oes NUL-terminate correctly

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Let's exclude GitWeb from being scanned; it is not distributed by us.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Some fixes in `clar`, pointed out by CodeQL.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
A few places where CodeQL thinks that variables might be uninitialized.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
These patches implement some defensive programming to address complaints
some static analyzers might have.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
CodeQL pointed out a couple of issues, which are addressed in this patch
series.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This patch series has been long in the making, ever since Johannes
Nicolai and I spiked this in November/December 2020.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
On Linux, the following command would cause the terminal to be stuck
waiting:

```
  git fetch origin foobar
```

The issue would be that the fetch would fail with the error

```
  fatal: couldn't find remote ref foobar
```

but the underlying `git-gvfs-helper` process wouldn't die. The
`subprocess_exit_handler()` method would close its stdin and stdout, but
that wouldn't be enough to cause the process to end.

This PR addresses that by skipping the `finish_command()` call of the
`clean_on_exit_handler` and instead lets `cleanup_children()` send a
SIGTERM to terminate those spawned child processes.
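
The difference between the two cleanup strategies can be reproduced with a generic child process. This sketch is an analogy and does not use `git-gvfs-helper` or Git's run-command code: the child is a plain Python process that sleeps, and `stop_child` with its `grace` parameter is a hypothetical helper. Closing the child's stdin and stdout does not make it exit, whereas sending a termination signal does.

```python
import subprocess
import sys


def stop_child(send_sigterm, grace=0.5):
    """Start a child that ignores its pipes, then try to stop it.

    Returns True if the child exited within `grace` seconds.
    """
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(30)"],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
    )
    # Closing the pipes alone: the child never reads or writes them.
    child.stdin.close()
    child.stdout.close()
    if send_sigterm:
        child.terminate()  # SIGTERM on POSIX
    try:
        child.wait(timeout=grace)
        return True
    except subprocess.TimeoutExpired:
        # Clean up so the sketch does not leak a sleeping process.
        child.kill()
        child.wait()
        return False
```

Calling `stop_child(False)` models the hang: waiting on the child after only closing its pipes times out. Calling `stop_child(True)` models the fix, where the signal ends the child promptly.
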
This patch series has been long in the making, ever since Johannes
Nicolai and I spiked this in November/December 2020.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This patch series has been long in the making, ever since Johannes
Nicolai and I spiked this in November/December 2020.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
@dscho dscho requested a review from mjcheetham August 19, 2025 11:26
@dscho dscho self-assigned this Aug 19, 2025
@dscho (Member, Author) commented Aug 19, 2025

Range-diff relative to v2.51.0-rc2
  • 5: 3974ce2 = 1: 18eea60 survey: calculate more stats on refs

  • 1: 8b4bd9c = 2: dfe0ee9 sparse-index.c: fix use of index hashes in expand_index

  • 2: 38e14d1 = 3: 21a5d1b t5300: confirm failure of git index-pack when non-idx suffix requested

  • 6: 641fdc3 = 4: ef96657 survey: show some commits/trees/blobs histograms

  • 3: 1de9f74 = 5: 10f0c5c t: remove advice from some tests

  • 4: 0f9eef3 = 6: d3385b3 t1092: add test for untracked files and directories

  • 13: b54ec9e = 7: 2f60fbf index-pack: disable rev-index if index file has non .idx suffix

  • 14: d7362c2 = 8: a4b44ec trace2: prefetch value of GIT_TRACE2_DST_DEBUG at startup

  • 7: 2986f76 = 9: f35b85a survey: add vector of largest objects for various scaling dimensions

  • 8: d82fad6 = 10: bfa16aa survey: add pathname of blob or tree to large_item_vec

  • 9: 68813ce = 11: 9414af4 survey: add commit-oid to large_item detail

  • 10: 5756588 = 12: 38640c0 survey: add commit name-rev lookup to each large_item

  • 11: 462ff89 = 13: bd95a48 survey: add --no-name-rev option

  • 12: 612228b = 14: 20fa940 survey: started TODO list at bottom of source file

  • 15: fe19ac2 = 15: fba1805 survey: expanded TODO list at the bottom of the source file

  • 16: 5f0c35e = 16: d524ace survey: expanded TODO with more notes

  • 17: 81cd49d = 17: bed25b7 reset --stdin: trim carriage return from the paths

  • 18: 71a3bf2 ! 18: 07c54e4 Identify microsoft/git via a distinct version suffix

    @@ Commit message
      ## GIT-VERSION-GEN ##
     @@
      
    - DEF_VER=v2.51.0-rc2
    + DEF_VER=v2.51.0
      
     +# Identify microsoft/git via a distinct version suffix
     +DEF_VER=$DEF_VER.vfs.0.0
  • 19: 89457cb = 19: d21a7ae gvfs: ensure that the version is based on a GVFS tag

  • 20: 5754f5f = 20: 2727b1e gvfs: add a GVFS-specific header file

  • 21: 22054fe = 21: 3e015ba gvfs: add the core.gvfs config setting

  • 22: 7382a1e = 22: 0e658cc gvfs: add the feature to skip writing the index' SHA-1

  • 23: 772275b = 23: a9a7ec0 gvfs: add the feature that blobs may be missing

  • 24: 9a816bd = 24: 8e517ed gvfs: prevent files to be deleted outside the sparse checkout

  • 25: 6a7635b = 25: 1886a0b gvfs: optionally skip reachability checks/upload pack during fetch

  • 26: 8501b2a = 26: 90b0a86 gvfs: ensure all filters and EOL conversions are blocked

  • 27: 2ef3797 = 27: a51059a gvfs: allow "virtualizing" objects

  • 28: 0133ef0 = 28: d6270d1 Hydrate missing loose objects in check_and_freshen()

  • 29: 7f6a9eb = 29: 8b09da4 sha1_file: when writing objects, skip the read_object_hook

  • 30: e3322cd = 30: 1e02db7 gvfs: add global command pre and post hook procs

  • 31: 35ee0b0 = 31: c4c97f6 t0400: verify that the hook is called correctly from a subdirectory

  • 32: 5669f47 = 32: f6b0e59 t0400: verify core.hooksPath is respected by pre-command

  • 33: 876d37e = 33: d606ac4 Pass PID of git process to hooks.

  • 34: a2b28cc = 34: 74cf0b9 sparse-checkout: make sure to update files with a modify/delete conflict

  • 35: 96092f2 = 35: 87cb969 worktree: allow in Scalar repositories

  • 36: 97c946e = 36: 69c2423 sparse-checkout: avoid writing entries with the skip-worktree bit

  • 37: 0bf1a82 = 37: bed1601 Do not remove files outside the sparse-checkout

  • 38: 93a3194 = 38: be218e4 send-pack: do not check for sha1 file when GVFS_MISSING_OK set

  • 245: 052ec52 = 39: 2035f90 gvfs: allow corrupt objects to be re-downloaded

  • 39: 7dd116c = 40: 596cc62 cache-tree: remove use of strbuf_addf in update_one

  • 40: edc66bd = 41: 0436a94 gvfs: block unsupported commands when running in a GVFS repo

  • 41: e39a08d = 42: a158d69 gvfs: allow overriding core.gvfs

  • 42: cf09f69 = 43: f55187b BRANCHES.md: Add explanation of branches and using forks

  • 43: 44ca9f5 = 44: ece93c2 Add virtual file system settings and hook proc

  • 44: 30f6cbc = 45: a43ce15 virtualfilesystem: don't run the virtual file system hook if the index has been redirected

  • 45: 5517767 = 46: ae4bf18 virtualfilesystem: check if directory is included

  • 46: 63dab0c = 47: 7cb493d backwards-compatibility: support the post-indexchanged hook

  • 47: 8cf9d7d = 48: fd3fa17 gvfs: verify that the built-in FSMonitor is disabled

  • 48: 7d49f89 = 49: ae207a6 wt-status: add trace2 data for sparse-checkout percentage

  • 49: 8b30048 = 50: 77a05c8 wt-status: add VFS hydration percentage to normal git status output

  • 50: 7861b3e = 51: 065f36f status: add status serialization mechanism

  • 51: 5d2e33d = 52: be6080d Teach ahead-behind and serialized status to play nicely together

  • 52: 4f4d2d9 = 53: 5d7f1c8 status: serialize to path

  • 53: 67e3219 = 54: c89c8a3 status: reject deserialize in V2 and conflicts

  • 54: f5732c1 = 55: ae20a35 serialize-status: serialize global and repo-local exclude file metadata

  • 55: d4beb1a = 56: 85f8b88 status: deserialization wait

  • 56: bd46fe9 = 57: 283754f status: deserialize with -uno does not print correct hint

  • 57: 8ba75bd = 58: b854957 fsmonitor: check CE_FSMONITOR_VALID in ce_uptodate

  • 58: ab0ecaa = 59: bc9d170 fsmonitor: add script for debugging and update script for tests

  • 59: 851c3a0 = 60: b317deb status: disable deserialize when verbose output requested.

  • 60: 38aa84d = 61: c748348 t7524: add test for verbose status deserialzation

  • 61: 8f9cecf = 62: da00afe deserialize-status: silently fallback if we cannot read cache file

  • 62: 729d7f7 = 63: e3eb163 gvfs:trace2:data: add trace2 tracing around read_object_process

  • 63: ab37149 = 64: 6271e7d gvfs:trace2:data: status deserialization information

  • 64: 30d321b = 65: f2f8e77 gvfs:trace2:data: status serialization

  • 65: fe9ced6 = 66: 7f640d8 gvfs:trace2:data: add vfs stats

  • 66: bd52253 = 67: 816d456 trace2: refactor setting process starting time

  • 67: 7445d9e = 68: 5765d2d trace2:gvfs:experiment: clear_ce_flags_1

  • 68: af777b7 = 69: 12606fb trace2:gvfs:experiment: report_tracking

  • 69: efac35b = 70: 2b62302 trace2:gvfs:experiment: read_cache: annotate thread usage in read-cache

  • 70: 52e5ad1 = 71: 4998f1e trace2:gvfs:experiment: read-cache: time read/write of cache-tree extension

  • 71: ab9c14c = 72: ecb63bb trace2:gvfs:experiment: add region to apply_virtualfilesystem()

  • 72: e056268 = 73: 19bd4ef trace2:gvfs:experiment: add region around unpack_trees()

  • 73: b505857 = 74: 037de62 trace2:gvfs:experiment: add region to cache_tree_fully_valid()

  • 74: 0f262e3 = 75: b7dfb3e trace2:gvfs:experiment: add unpack_entry() counter to unpack_trees() and report_tracking()

  • 75: 225f8df = 76: 16ed1ac trace2:gvfs:experiment: increase default event depth for unpack-tree data

  • 76: 18e399d = 77: fb4f5ff trace2:gvfs:experiment: add data for check_updates() in unpack_trees()

  • 77: 59fca4e = 78: f60deec Trace2:gvfs:experiment: capture more 'tracking' details

  • 78: 734f1e6 = 79: fba6d9c credential: set trace2_child_class for credential manager children

  • 79: 0ceba58 = 80: e64d4da sub-process: do not borrow cmd pointer from caller

  • 80: 363276e = 81: 0e5b181 sub-process: add subprocess_start_argv()

  • 81: 7f0faa3 = 82: 12f7afa sha1-file: add function to update existing loose object cache

  • 82: 828de71 = 83: c92a2c6 packfile: add install_packed_git_and_mru()

  • 83: 56398c5 = 84: 07ccbe9 index-pack: avoid immediate object fetch while parsing packfile

  • 84: 243d6e3 = 85: 87c5076 gvfs-helper: create tool to fetch objects using the GVFS Protocol

  • 85: 80a308e = 86: da78e18 sha1-file: create shared-cache directory if it doesn't exist

  • 86: 8876640 = 87: a195263 gvfs-helper: better handling of network errors

  • 87: 36c3311 = 88: 8a883d3 gvfs-helper-client: properly update loose cache with fetched OID

  • 88: fd54c29 = 89: cbb83f0 gvfs-helper: V2 robust retry and throttling

  • 89: ab8f7b4 = 90: 2388db8 gvfs-helper: expose gvfs/objects GET and POST semantics

  • 90: b504cd0 = 91: 79f28e6 gvfs-helper: dramatically reduce progress noise

  • 91: a6da5f2 = 92: 218ea02 gvfs-helper: handle pack-file after single POST request

  • 92: ea3fc12 = 93: a55dcac test-gvfs-prococol, t5799: tests for gvfs-helper

  • 93: a07636a = 94: 0b60864 gvfs-helper: move result-list construction into install functions

  • 94: 157e38c = 95: 925542d t5799: add support for POST to return either a loose object or packfile

  • 95: b975a53 = 96: 6773657 t5799: cleanup wc-l and grep-c lines

  • 96: f018ea4 = 97: 9b3c664 gvfs-helper: verify loose objects after write

  • 97: cec4767 = 98: 71616cb t7599: create corrupt blob test

  • 98: 27156ef = 99: 43e9855 gvfs-helper: add prefetch support

  • 99: c66a5ab = 100: 4c2bf79 gvfs-helper: add prefetch .keep file for last packfile

  • 100: a0d1006 = 101: a9aff75 gvfs-helper: do one read in my_copy_fd_len_tail()

  • 101: 1670522 = 102: 8ddd012 gvfs-helper: move content-type warning for prefetch packs

  • 102: fb034dd = 103: e5ad80a fetch: use gvfs-helper prefetch under config

  • 103: a45314c = 104: 90a8c49 gvfs-helper: better support for concurrent packfile fetches

  • 104: cf2a766 = 105: 4b943d3 remote-curl: do not call fetch-pack when using gvfs-helper

  • 105: 4b8cb96 = 106: 72e3ce8 fetch: reprepare packs before checking connectivity

  • 106: 9b1c584 = 107: 29e6969 gvfs-helper: retry when creating temp files

  • 107: 9822cea = 108: bf00f92 sparse: avoid warnings about known cURL issues in gvfs-helper.c

  • 108: 9343ed4 = 109: 9e9eed1 gvfs-helper: add --max-retries to prefetch verb

  • 109: ad9990f = 110: 03fcf76 t5799: add tests to detect corrupt pack/idx files in prefetch

  • 110: 970a0e2 = 111: fef0017 gvfs-helper: ignore .idx files in prefetch multi-part responses

  • 111: c44bf7b = 112: fdf5b90 t5799: explicitly test gvfs-helper --fallback and --no-fallback

  • 112: efc34dc = 113: 5ea8127 gvfs-helper: don't fallback with new config

  • 113: 7f56223 = 114: 53dbe72 test-gvfs-protocol: add cache_http_503 to mayhem

  • 114: a9ee22a = 115: d65eea4 t5799: add unit tests for new gvfs.fallback config setting

  • 115: 80289e8 = 116: d9b2dcd maintenance: care about gvfs.sharedCache config

  • 116: 54c97e5 = 117: e29d04e unpack-trees:virtualfilesystem: Improve efficiency of clear_ce_flags

  • 117: c233db8 = 118: 7eb56c8 homebrew: add GitHub workflow to release Cask

  • 118: 799ed9a = 119: 0f8e14f Adding winget workflows

  • 119: c958fc3 = 120: 57124c3 Disable the monitor-components workflow in msft-git

  • 120: e29cca2 = 121: 0431002 .github: enable windows builds on microsoft fork

  • 121: 1c53898 = 122: 6b8ae45 .github/actions/akv-secret: add action to get secrets

  • 122: 47ca597 = 123: 5ac788d release: create initial Windows installer build workflow

  • 123: 4f9437f = 124: 85a2780 release: create initial Windows installer build workflow

  • 124: 688df84 = 125: a0e80d0 help: special-case HOST_CPU universal

  • 125: 91a92e4 = 126: ddc2bca release: add Mac OSX installer build

  • 126: 293ae1f = 127: 13f92e0 release: build unsigned Ubuntu .deb package

  • 127: ae9dae2 = 128: e8b80df release: add signing step for .deb package

  • 128: 78d5c15 = 129: 7dcec35 release: create draft GitHub release with packages & installers

  • 129: a04f42d = 130: 73168c4 build-git-installers: publish gpg public key

  • 130: 0a32c05 = 131: 4d9a7d8 release: continue pestering until user upgrades

  • 131: f52896a = 132: 9308b72 dist: archive HEAD instead of HEAD^{tree}

  • 137: b38a9eb = 133: 8ee6938 release: include GIT_BUILT_FROM_COMMIT in MacOS build

  • 139: cfb1016 = 134: 531fc95 release: add installer validation

  • 132: 31cff74 = 135: 3ede924 update-microsoft-git: create barebones builtin

  • 133: ad45ba5 = 136: 9272521 update-microsoft-git: Windows implementation

  • 134: 5c78ce3 = 137: 5bebc82 update-microsoft-git: use brew on macOS

  • 135: 9d0a18b = 138: dd6be5c .github: reinstate ISSUE_TEMPLATE.md for microsoft/git

  • 136: 8963c17 = 139: 1f4fc34 .github: update PULL_REQUEST_TEMPLATE.md

  • 140: e5f4e63 = 140: e827bc4 git_config_set_multivar_in_file_gently(): add a lock timeout

  • 138: a917c1b = 141: 74ddf0b Adjust README.md for microsoft/git

  • 141: 61ccf7f = 142: bd144f8 scalar: set the config write-lock timeout to 150ms

  • 142: dbe66fc = 143: b8035a1 scalar: add docs from microsoft/scalar

  • 143: 95acf54 = 144: 59f4b95 scalar (Windows): use forward slashes as directory separators

  • 144: 91321d4 = 145: e04c82b scalar: add retry logic to run_git()

  • 145: d6dc2ed = 146: 916c432 scalar: support the config command for backwards compatibility

  • 146: 9d553c9 = 147: a1a8744 scalar: implement a minimal JSON parser

  • 147: deb75bc = 148: 96ac103 scalar clone: support GVFS-enabled remote repositories

  • 148: b2c0d18 = 149: 2559da3 test-gvfs-protocol: also serve smart protocol

  • 149: 63ce837 = 150: e8e4251 gvfs-helper: add the endpoint command

  • 150: 090956a = 151: 007c22d dir_inside_of(): handle directory separators correctly

  • 151: d74636c = 152: d26e243 scalar: disable authentication in unattended mode

  • 152: 2e50db1 = 153: 6bae332 scalar: do initialize gvfs.sharedCache

  • 153: 69f523b = 154: b3c317e scalar diagnose: include shared cache info

  • 154: 531ca42 = 155: 0b806b5 scalar: only try GVFS protocol on https:// URLs

  • 155: fa01f7e = 156: 8f62ab7 scalar: verify that we can use a GVFS-enabled repository

  • 156: c9c6a46 = 157: 9d5cd82 scalar: add the cache-server command

  • 157: 143a1fc = 158: 217fdab scalar: add a test toggle to skip accessing the vsts/info endpoint

  • 158: e585d46 = 159: 1437866 scalar: adjust documentation to the microsoft/git fork

  • 159: a53b81a = 160: eacd5a0 scalar: enable untracked cache unconditionally

  • 160: dbf1a8f = 161: a3af3c2 scalar: parse clone --no-fetch-commits-and-trees for backwards compatibility

  • 161: 86cefa4 = 162: 2d923b0 scalar: make GVFS Protocol a forced choice

  • 162: 6b743c9 = 163: 1880963 scalar: work around GVFS Protocol HTTP/2 failures

  • 163: 7db74e8 = 164: 1b9c41a scalar diagnose: accommodate Scalar's Functional Tests

  • 164: 45c3638 = 165: 07af3a3 ci: run Scalar's Functional Tests

  • 165: ec0c379 = 166: a689d52 scalar: upgrade to newest FSMonitor config setting

  • 166: b730d02 = 167: 1f77217 abspath: make strip_last_path_component() global

  • 167: 0421912 = 168: b029a04 scalar: .scalarCache should live above enlistment

  • 168: 93be461 = 169: e3dce1d add/rm: allow adding sparse entries when virtual

  • 169: da03d5c = 170: f40961e sparse-checkout: add config to disable deleting dirs

  • 170: 87fdf7d = 171: 6d8eba0 diff: ignore sparse paths in diffstat

  • 171: b10c8d0 = 172: 725c9e2 repo-settings: enable sparse index by default

  • 172: d8837be = 173: 0dc5abd diff(sparse-index): verify with partially-sparse

  • 173: 4e82fc6 = 174: 58b2522 stash: expand testing for git stash -u

  • 174: 8064cd1 = 175: 5d6f2f6 sequencer: avoid progress when stderr is redirected

  • 175: 7923401 = 176: 40a9a47 sparse: add vfs-specific precautions

  • 176: 2dbee85 = 177: ef9e9e7 reset: fix mixed reset when using virtual filesystem

  • 177: d7f3123 = 178: eb92d8b sparse-index: add ensure_full_index_with_reason()

  • 178: 5c5f02c = 179: 312cee3 treewide: add reasons for expanding index

  • 179: e04a164 = 180: 2ed5e99 treewide: custom reasons for expanding index

  • 180: 4d17bf5 = 181: e816d9e sparse-index: add macro for unaudited expansions

  • 181: 932d616 = 182: b3c4b3a Docs: update sparse index plan with logging

  • 182: b5f02b6 = 183: 86ee1d2 sparse-index: log failure to clear skip-worktree

  • 183: e70382a = 184: 10f201b stash: use -f in checkout-index child process

  • 184: ab99783 = 185: 636faf3 sparse-index: do not copy hashtables during expansion

  • 185: 3b1ce85 = 186: eabc8df sparse-checkout: remove use of the_repository

  • 186: 22a91d4 = 187: 08cbe49 sparse-checkout: add basics of 'clean' command

  • 187: c0c9889 = 188: b40b74f sparse-checkout: match some 'clean' behavior

  • 188: 1c30b3a = 189: 1c55ee3 dir: add generic "walk all files" helper

  • 189: 58830f3 = 190: e1aba5d sparse-checkout: add --verbose option to 'clean'

  • 190: 881c1bd = 191: c0035c9 sparse-index: point users to new 'clean' action

  • 191: e78e256 = 192: 4bf26c6 t: expand tests around sparse merges and clean

  • 192: 368da85 = 193: 0503b78 sparse-checkout: make 'clean' clear more files

  • 193: 5582a4c = 194: 43c3da3 sparse-checkout: mark 'clean' as experimental

  • 194: 9bb8fe7 = 195: d19af00 sub-process: avoid leaking cmd

  • 195: 4a5bd61 = 196: 58ff48a remote-curl: release filter options before re-setting them

  • 196: 716e5c2 = 197: e5bf48f transport: release object filter options

  • 197: f23f7ff = 198: b53e5c2 push: don't reuse deltas with path walk

  • 198: 985a9db = 199: 99f167c t7900-maintenance.sh: reset config between tests

  • 199: 34c09f2 = 200: 7f890e6 maintenance: add cache-local-objects maintenance task

  • 200: 349d5d6 = 201: 107be90 scalar.c: add cache-local-objects task

  • 201: a08f8b6 = 202: f148e94 git.c: add VFS enabled cmd blocking

  • 202: 4c23a99 = 203: 8d18b51 git.c: permit repack cmd in Scalar repos

  • 203: de540de = 204: 514cbe6 git.c: permit fsck cmd in Scalar repos

  • 204: 2c23a9d = 205: 24b1abe git.c: permit prune cmd in Scalar repos

  • 207: ec1c36e = 206: 2b83a1d hooks: add custom post-command hook config

  • 205: 37ac30e = 207: b8cd956 worktree: remove special case GVFS cmd blocking

  • 208: f27c13d = 208: f7a55e9 Docs: fix asciidoc failures from short delimiters

  • 206: ed322f7 = 209: d5cd206 builtin/repack.c: emit warning when shared cache is present

  • 209: 3387603 = 210: e42a475 hooks: make hook logic memory-leak free

  • 210: 03567d7 = 211: 3f8ca8d t5309: create failing test for 'git index-pack'

  • 211: f7eb93e = 212: c1a3bc7 gvfs-helper: pass long values where expected

  • 212: 9ec64a5 = 213: 0dea22e gvfs-helper-client: clean up server process(es)

  • 214: 1109904 = 214: 66f6357 clar: pass a string for a %s format placeholder

  • 216: c75ecbe = 215: 2795fa7 clar(clar__assert_equal): do in-bounds check before accessing element

  • 218: 1e5258d = 216: fdd8377 clar(clar_summary_testsuite): avoid thread-unsafe localtime()

  • 213: 2c957ae = 217: 559396e cat_one_file(): make it easy to see that the size variable is initialized

  • 215: 8005a73 = 218: 13a1ed0 fsck: avoid using an uninitialized variable

  • 217: 8baf90b = 219: ae4a90a load_revindex_from_disk(): avoid accessing uninitialized data

  • 219: 27a1c22 = 220: 877e3a3 revision: defensive programming

  • 226: 1008bff = 221: 8046321 load_pack_mtimes_file(): avoid accessing uninitialized data

  • 220: 26aeaa9 = 222: 07ea788 get_parent(): defensive programming

  • 221: a703260 = 223: 5694bb5 fetch-pack: defensive programming

  • 222: 832134d = 224: 7120035 unparse_commit(): defensive programming

  • 223: 7522afa = 225: a8f231c verify_commit_graph(): defensive programming

  • 224: 279fdc3 = 226: f561dfc stash: defensive programming

  • 225: aa20303 = 227: 00e3d8d stash: defensive programming

  • 227: 0264f5d = 228: 0448b70 push: defensive programming

  • 229: da591b1 = 229: 7b22fca fetch: defensive programming

  • 228: 89c0034 = 230: c9e5c48 fetch: silence a CodeQL alert about a local variable's address' use after release

  • 231: 4d6bbf7 = 231: 23009bf describe: defensive programming

  • 230: 8da7719 = 232: c3fb973 submodule: check return value of submodule_from_path()

  • 233: 1fae381 = 233: 7d161d3 inherit_tracking(): defensive programming

  • 234: 3ac9553 = 234: 4b1d576 codeql: run static analysis as part of CI builds

  • 232: ed47e7c = 235: 0a1a0cf test-tool repository: check return value of lookup_commit()

  • 235: ca2d5f9 = 236: 137c9ca codeql: publish the sarif file as build artifact

  • 238: d11822d = 237: afefd5a shallow: handle missing shallow commits gracefully

  • 236: fcaa01f = 238: d2daf2d codeql: disable a couple of non-critical queries for now

  • 240: ac726a7 = 239: c01e438 commit-graph: suppress warning about using a stale stack addresses

  • 237: 4ac91c9 = 240: 3c93a94 date: help CodeQL understand that there are no leap-year issues here

  • 239: 2319d63 = 241: 800fb94 help: help CodeQL understand that consuming envvars is okay here

  • 241: eb19062 = 242: 1fe452c ctype: help CodeQL understand that sane_istest() does not access array past end

  • 242: 820f7f2 = 243: ecbecc9 ctype: accommodate for CodeQL misinterpreting the z in mallocz()

  • 243: 4173874 = 244: 69e0125 strbuf_read: help with CodeQL misunderstanding that strbuf_read() does NUL-terminate correctly

  • 244: b683611 = 245: 4fa44ec codeql: also check JavaScript code

@dscho dscho marked this pull request as ready for review August 19, 2025 16:20

@derrickstolee derrickstolee left a comment

Super-clean range-diff. Almost suspiciously clean.

@dscho
Member Author

dscho commented Aug 21, 2025

> Super-clean range-diff. Almost suspiciously clean.

That's because all the messy bits were addressed already in the -rc0, -rc1 and -rc2 PRs ;-)

@dscho dscho marked this pull request as draft August 21, 2025 12:37
@dscho
Member Author

dscho commented Aug 21, 2025

I actually broke the functionality of 9150f1d in my rebased d9b2dcd; will fix, and add a regression test case (I actually have that already, locally).

mjcheetham and others added 2 commits August 22, 2025 09:44
codeql: run the CodeQL workflow every week

Run the CodeQL GitHub workflow every week, on Mondays at 03:00 UTC.
It's good practice to run this regularly as queries are updated.

[This was accidentally dropped from vfs-2.50.1]

Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>

We need that test case to verify the functionality...

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
@dscho dscho marked this pull request as ready for review August 22, 2025 10:44
Comment on lines +371 to +373
# move the loose objects into the shared objects as if they had been
# fetched via the `gvfs-helper`
mv gvfs-worktree/.git/objects/?? gvfs-shared/objects/ &&
Member Author

@derrickstolee I know this is a blast from the past, but I do have a question: When you wrote 2864f7c#diff-a9031966d3e59230a29bf2a48c0902e7f74dfd1debf7f813b3880ca458af63a7L1026-R1031, the idea was to care about the loose objects downloaded by the gvfs-helper, right?

The problem I have with this (and which I only realized today, when I wrote this regression test case in order to validate my rebased variant of your original 2864f7c): there are two places in Scalar enlistments where loose objects can exist, the shared objects and the repository-local objects/. The latter houses e.g. local commits. And the latter never seem to be repacked with 2864f7c (which remains true for my rebased variant).

Is this problematic in real-world enlistments? I guess it is a lot less important to take care of locally-generated Git objects (which are bandwidth-limited by the fact that they are generated by an individual human being) than to take care of potentially millions of hydrated Git objects from the remote repository. Is that the reasoning we should adopt here?
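
To make the two locations concrete, here is a minimal sketch for comparing how many loose objects sit in each of them. The helper name `count_loose_objects` is made up for illustration and is not part of Git or Scalar; it only assumes the usual layout where loose objects live in two-hex-digit fan-out directories below an object directory.

```shell
# Hypothetical helper (not part of Git or Scalar): count loose object files
# below an object directory by walking its two-hex-digit fan-out directories.
count_loose_objects () {
	for dir in "$1"/[0-9a-f][0-9a-f]
	do
		# An unmatched glob stays literal; skip it instead of failing.
		if test -d "$dir"
		then
			find "$dir" -type f
		fi
	done | wc -l | tr -d ' '
}
```

With the directory names from the test snippet above, running `count_loose_objects gvfs-worktree/.git/objects` and `count_loose_objects gvfs-shared/objects` before and after the maintenance task should indicate which of the two locations actually gets cleaned up.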
