-
Notifications
You must be signed in to change notification settings - Fork 105
Rebase onto Git for Windows 2.52.0-rc1 #812
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Verify that the core.hooksPath configuration is repsected by the pre-command hook. Original regression test was written by Alejandro Pauly. Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
Signed-off-by: Alejandro Pauly <alpauly@microsoft.com>
When using the sparse-checkout feature, the file might not be on disk because the skip-worktree bit is on. This used to be a bug in the (hence deleted) `recursive` strategy. Let's ensure that this bug does not resurface. Signed-off-by: Kevin Willford <kewillf@microsoft.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The 'git worktree' command was marked as BLOCK_ON_GVFS_REPO because it does not interact well with the virtual filesystem of VFS for Git. When a Scalar clone uses the GVFS protocol, it enables the GVFS_BLOCK_COMMANDS flag, since commands like 'git gc' do not work well with the GVFS protocol. However, 'git worktree' works just fine with the GVFS protocol since it isn't doing anything special. It copies the sparse-checkout from the current worktree, so it does not have performance issues. This is a highly requested option. The solution is to stop using the BLOCK_ON_GVFS_REPO option and instead add a special-case check in cmd_worktree() specifically for a particular bit of the 'core_gvfs' global variable (loaded by very early config reading) that corresponds to the virtual filesystem. The bit that most closely resembled this behavior was non-obviously named, but does provide a signal that we are in a Scalar clone and not a VFS for Git clone. The error message is copied from git.c, so it will have the same output as before if a user runs this in a VFS for Git clone. Signed-off-by: Derrick Stolee <derrickstolee@github.com>
When using the sparse-checkout feature git should not write to the working directory for files with the skip-worktree bit on. With the skip-worktree bit on the file may or may not be in the working directory and if it is not we don't want or need to create it by calling checkout_entry. There are two callers of checkout_target. Both of which check that the file does not exist before calling checkout_target. load_current which make a call to lstat right before calling checkout_target and check_preimage which will only run checkout_taret it stat_ret is less than zero. It sets stat_ret to zero and only if !stat->cached will it lstat the file and set stat_ret to something other than zero. This patch checks if skip-worktree bit is on in checkout_target and just returns so that the entry doesn't not end up in the working directory. This is so that apply will not create a file in the working directory, then update the index but not keep the working directory up to date with the changes that happened in the index. Signed-off-by: Kevin Willford <kewillf@microsoft.com>
Signed-off-by: Kevin Willford <kewillf@microsoft.com>
Add the ability to block built-in commands based on if the `core.gvfs` setting has the `GVFS_USE_VIRTUAL_FILESYSTEM` bit set. This allows us to selectively block commands that use the GVFS protocol, but don't use VFS for Git (for example repos cloned via `scalar clone` against Azure DevOps). Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
While performing a fetch with a virtual file system we know that there will be missing objects and we don't want to download them just because of the reachability of the commits. We also don't want to download a pack file with commits, trees, and blobs since these will be downloaded on demand. This flag will skip the first connectivity check and by returning zero will skip the upload pack. It will also skip the second connectivity check but continue to update the branches to the latest commit ids. Signed-off-by: Kevin Willford <kewillf@microsoft.com>
As of 9e59b38 (object-file: emit corruption errors when detected, 2022-12-14), Git will loudly complain about corrupt objects. That is fine, as long as the idea isn't to re-download locally-corrupted objects. But that's exactly what we want to do in VFS for Git via the `read-object` hook, as per the `GitCorruptObjectTests` code added in microsoft/VFSForGit@2db0c030eb25 (New features: [...] - GVFS can now recover from corrupted git object files [...] , 2018-02-16). So let's support precisely that, and add a regression test that ensures that re-downloading corrupt objects via the `read-object` hook works. While at it, avoid the XOR operator to flip the bits, when we actually want to make sure that they are turned off: Use the AND-NOT operator for that purpose. Helped-by: Matthew John Cheetham <mjcheetham@outlook.com> Helped-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Loosen the blocking of the `repack` command from all "GVFS repos" (those that have `core.gvfs` set) to only those that actually use the virtual file system (VFS for Git only). This allows for `repack` to be used in Scalar clones. Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
Ensure all filters and EOL conversions are blocked when running under GVFS so that our projected file sizes will match the actual file size when it is hydrated on the local machine. Signed-off-by: Ben Peart <Ben.Peart@microsoft.com>
String formatting can be a performance issue when there are hundreds of thousands of trees. Change to stop using the strbuf_addf and just add the strings or characters individually. There are a limited number of modes so added a switch for the known ones and a default case if something comes through that are not a known one for git. In one scenario regarding a huge worktree, this reduces the time required for a `git checkout <branch>` from 44 seconds to 38 seconds, i.e. it is a non-negligible performance improvement. Signed-off-by: Kevin Willford <kewillf@microsoft.com>
Loosen the blocking of the `fsck` command from all "GVFS repos" (those that have `core.gvfs` set) to only those that actually use the virtual file system (VFS for Git only). This allows for `fsck` to be used in Scalar clones. Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
The idea is to allow blob objects to be missing from the local repository, and to load them lazily on demand. After discussing this idea on the mailing list, we will rename the feature to "lazy clone" and work more on this. Signed-off-by: Ben Peart <Ben.Peart@microsoft.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The following commands and options are not currently supported when working in a GVFS repo. Add code to detect and block these commands from executing. 1) fsck 2) gc 4) prune 5) repack 6) submodule 8) update-index --split-index 9) update-index --index-version (other than 4) 10) update-index --[no-]skip-worktree 11) worktree Signed-off-by: Ben Peart <benpeart@microsoft.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Loosen the blocking of the `prune` command from all "GVFS repos" (those that have `core.gvfs` set) to only those that actually use the virtual file system (VFS for Git only). This allows for `prune` to be used in Scalar clones. Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
On index load, clear/set the skip worktree bits based on the virtual file system data. Use virtual file system data to update skip-worktree bit in unpack-trees. Use virtual file system data to exclude files and folders not explicitly requested. Update 2022-04-05: disable the "present-despite-SKIP_WORKTREE" file removal behavior when 'core.virtualfilesystem' is enabled. Signed-off-by: Ben Peart <benpeart@microsoft.com>
Hydrate missing loose objects in check_and_freshen() when running virtualized. Add test cases to verify read-object hook works when running virtualized. This hook is called in check_and_freshen() rather than check_and_freshen_local() to make the hook work also with alternates. Helped-by: Kevin Willford <kewillf@microsoft.com> Signed-off-by: Ben Peart <Ben.Peart@microsoft.com>
In earlier versions of `microsoft/git`, we found a user who had set `core.gvfs = false` in their global config. This should not have been necessary, but it also should not have caused a problem. However, it did. The reason was that `gvfs_load_config_value()` was called from `config.c` when reading config key/value pairs from all the config files. The local config should override the global config, and this is done by `config.c` reading the global config first then reading the local config. However, our logic only allowed writing the `core_gvfs` variable once. In v2.51.0, we had to adapt to upstream changes that changed way the `core.gvfs` config value is read, and the special handling is no longer necessary, yet we still want the test case that ensures that this bug does not experience a regression. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Replace the special casing of the `worktree` command being blocked on VFS-enabled repos with the new `BLOCK_ON_VFS_ENABLED` flag. Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
…x has been redirected Fixes #13 Some git commands spawn helpers and redirect the index to a different location. These include "difftool -d" and the sequencer (i.e. `git rebase -i`, `git cherry-pick` and `git revert`) and others. In those instances we don't want to update their temporary index with our virtualization data. Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Ben Peart <Ben.Peart@microsoft.com>
If we are going to write an object there is no use in calling the read object hook to get an object from a potentially remote source. We would rather just write out the object and avoid the potential round trip for an object that doesn't exist. This change adds a flag to the check_and_freshen() and freshen_loose_object() functions' signatures so that the hook is bypassed when the functions are called before writing loose objects. The check for a local object is still performed so we don't overwrite something that has already been written to one of the objects directories. Based on a patch by Kevin Willford. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Emit a warning message when the `gvfs.sharedCache` option is set that the `repack` command will not perform repacking on the shared cache. In the future we can teach `repack` to operate on the shared cache, at which point we can drop this commit. Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
Add trace2 region around read_object_process to collect time spent waiting for missing objects to be dynamically fetched. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Add trace2 region and data events describing attempts to deserialize
status data using a status cache.
A category:status, label:deserialize region is pushed around the
deserialize code.
Deserialization results when reading from a file are:
category:status, path = <path>
category:status, polled = <number_of_attempts>
category:status, result = "ok" | "reject"
When reading from STDIN are:
category:status, path = "STDIN"
category:status, result = "ok" | "reject"
Status will fallback and run a normal status scan when a "reject"
is reported (unless "--deserialize-wait=fail").
If "ok" is reported, status was able to use the status cache and
avoid scanning the workdir.
Additionally, a cmd_mode is emitted for each step: collection,
deserialization, and serialization. For example, if deserialization
is attempted and fails and status falls back to actually computing
the status, a cmd_mode message containing "deserialize" is issued
and then a cmd_mode for "collect" is issued.
Also, if deserialization fails, a data message containing the
rejection reason is emitted.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Add trace information around status serialization. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Report virtual filesystem summary data. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Teach STATUS to optionally serialize the results of a status computation to a file. Teach STATUS to optionally read an existing serialization file and simply print the results, rather than actually scanning. This is intended for immediate status results on extremely large repos and assumes the use of a service/daemon to maintain a fresh current status snapshot. 2021-10-30: packet_read() changed its prototype in ec9a37d (pkt-line.[ch]: remove unused packet_read_line_buf(), 2021-10-14). 2021-10-30: sscanf() now does an extra check that "%d" goes into an "int" and complains about "uint32_t". Replacing with "%u" fixes the compile-time error. 2021-10-30: string_list_init() was removed by abf897b (string-list.[ch]: remove string_list_init() compatibility function, 2021-09-28), so we need to initialize manually. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
A few places where CodeQL thinks that variables might be uninitialized. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
…oes NUL-terminate correctly Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The code is a bit too hard to reason about for CodeQL to figure out whether the `fill_commit_graph_info()` function is at all called after `write_commit_graph()` returns (and hence whether `topo_levels` goes out of context before it is used again). The Git project insists that this is correct (and does not want to make the code more obviously correct), so let's silence CodeQL's complaints in this instance. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
These patches implement some defensive programming to address complaints some static analyzers might have. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Let's exclude GitWeb from being scanned; It is not distributed by us. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
CodeQL pointed out a couple of issues, which are addressed in this patch series. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This patch series has been long in the making, ever since Johannes Nicolai and myself spiked this in November/December 2020. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The microsoft/git fork includes pre- and post-command hooks, with the
initial intention of using these for VFS for Git. In that environment,
these are important hooks to avoid concurrent issues when the
virtualization is incomplete.
However, in the Office monorepo the post-command hook is used in a
different way. A custom hook is used to update the sparse-checkout, if
necessary. To avoid this hook from being incredibly slow on every Git
command, this hook checks for the existence of a "sentinel file" that is
written by a custom post-index-change hook and no-ops if that file does
not exist.
However, even this "no-op" is 200ms due to the use of two scripts (one
simple script in .git/hooks/ does some environment checking and then
calls a script from the working directory which actually contains the
logic).
Add a new config option, 'postCommand.strategy', that will allow for
multiple possible strategies in the future. For now, the one we are
adding is 'post-index-change' which states that we should write a
sentinel file instead of running the 'post-index-change' hook and then
skip the 'post-command' hook if the proper sentinel file doesn't exist.
(If it does exist, then delete it and run the hook.)
---
This fork contains changes specific to monorepo scenarios. If you are an
external contributor, then please detail your reason for submitting to
this fork:
* [ ] This is an early version of work already under review upstream.
* [ ] This change only applies to interactions with Azure DevOps and the
GVFS Protocol.
* [ ] This change only applies to the virtualization hook and VFS for
Git.
* [x] This change only applies to custom bits in the microsoft/git fork.
This patch series has been long in the making, ever since Johannes Nicolai and myself spiked this in November/December 2020. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Range-diff relative to v2.52.0-rc0
|
Let's output the URL of the release that was just created; That will make it more convenient to go from finished workflow run to publishing the release. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
|
Note: I slipped in 7c92b3b, which adds a convenient message to the |
When the top-level Git process is an alias, it doesn't load much config and thus the postCommand.strategy config setting is not loaded properly. This leads to the post-command hook being run more frequently than we want, including an alias for 'git add' running the hook even when the worktree did not change. Similar state is not loaded by 'git version' or 'git typo'. Signed-off-by: Derrick Stolee <stolee@gmail.com>
When the post-command hook runs, we could be in several custom states: 1. The command is a regular builtin, but the repo setup is incomplete. (Example: "git version", "git rev-parse HEAD") 2. The command is a dashed command, but the top level process uses a space in its call. (Example: "git http-request") 3. The command is an alias that goes to a builtin. 4. The command has no arguments or is only helptext. Each of these cases leads to a place where we previously had not loaded the postCommand.strategy config and would execute the post-command hook without looking for a sentinel file. There are two fixes here: First, we use early config parsing so we can get config details without fully setting up the repository. This is how core.hooksPath works in these situations. Second, we need to make sure handle_hook_replacement() is called even when the repo's gitdir is NULL. This requires a small amount of care to say that a sentinel file cannot exist if the gitdir isn't set, as we would not have written one without initializing the gitdir. This gets all of the desired behaviors we want for this setting without doing anything extreme with how pre- and post-command hooks run otherwise. Signed-off-by: Derrick Stolee <stolee@gmail.com>
The postCommand.strategy=worktree-change config option allows for skipping the post-command hook when the Git command doesn't change the worktree. However, sometimes this config isn't loaded before the post-command hook is invoked, causing the hook to run in cases where we'd prefer it to not run. Examples include: `git version` or `git <typo>`. The tricky bit is that there are several places where we get here and standard config mechanisms can't load due to not having a `gitdir` in the repository struct. We fix this by: 1. Using the "load early config" mechanism also used by `core.hooksPath`. 2. Skipping the lookup for the sentinel file when there isn't a `gitdir` since we couldn't have written one without it. The change is given in two commits: the first expands the tests with the existing behavior and the second changes the behavior, showing the impact on those tests. * [X] This change only applies to microsoft/git specifics (post-command hook) See #736 and #748 for prior art in this space.
The usual thing. See git-for-windows#5937.