Commit 2fb2175

sparse-checkout: clear tracked sparse dirs
When changing the scope of a sparse-checkout using cone mode, we might have some tracked directories go out of scope. The current logic removes the tracked files from within those directories, but leaves the ignored files within those directories. This is a bit unexpected to users who have given input to Git saying they don't need those directories anymore.

This is something that is new to the cone mode pattern type: the user has explicitly said "I want these directories and _not_ those directories." The typical sparse-checkout patterns more generally apply to "I want files with these patterns" so it is natural to leave ignored files as they are. This focus on directories in cone mode provides us an opportunity to change the behavior.

Leaving these ignored files in the sparse directories makes it impossible to gain performance benefits in the sparse index. When we track into these directories, we need to know if the files are ignored or not, which might depend on the _tracked_ .gitignore file(s) within the sparse directory. This depends on the indexed version of the file, so the sparse directory must be expanded.

By deleting the sparse directories when changing scope (or running 'git sparse-checkout reapply') we regain these performance benefits as if the repository was in a clean state.

Since these ignored files are frequently build output or helper files from IDEs, the users should not need the files now that the tracked files are removed. If the tracked files reappear, then they will have newer timestamps than the build artifacts, so the artifacts will need to be regenerated anyway.

If users depend on ignored files within the sparse directories, then they have created a bad shape in their repository. Regardless, such shapes mean that changing the behavior for all cone mode users might be too risky to take on at the moment. Since this data shape makes it impossible to get performance benefits using the sparse index, we limit the change to only be enabled when the sparse index is enabled. Users can opt out of this behavior by disabling the sparse index.

Depending on user feedback or real-world use, we might want to consider expanding the behavior change to all of cone mode. Since we are currently restricting to the sparse index case, we can use the existence of sparse directory entries in the index as indicators of which directories should be removed. The contained files are deleted by a 'git clean -dfx' subcommand while the directories themselves are deleted once they are empty.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
1 parent adf5b15 commit 2fb2175
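
The selection step described in the commit message can be sketched in isolation. The following is a minimal illustration, not Git's implementation: `struct entry_sketch`, `collect_clean_args` and `should_run_clean` are hypothetical names, and the two integer flags stand in for the `S_ISSPARSEDIR(ce->ce_mode)` and `repo_file_exists(r, ce->name)` checks that appear in the diff. It shows why only tracked sparse directories that still exist on disk become pathspecs, and why the clean is skipped when none are found.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an index entry; not Git's struct cache_entry. */
struct entry_sketch {
	const char *name;   /* path; sparse directory entries end in '/' */
	int is_sparse_dir;  /* stands in for S_ISSPARSEDIR(ce->ce_mode) */
	int exists_on_disk; /* stands in for repo_file_exists(r, ce->name) */
};

#define FIXED_ARGS 3 /* "clean", "-dfx", "--" */

/*
 * Fill argv with the fixed arguments followed by every sparse directory
 * that still exists on disk. Returns the number of arguments written,
 * or 0 if argv is too small. The result is not NULL-terminated.
 */
static size_t collect_clean_args(const struct entry_sketch *entries, size_t nr,
				 const char **argv, size_t cap)
{
	size_t n = 0, i;

	assert(entries != NULL || nr == 0);
	assert(argv != NULL);

	if (cap < FIXED_ARGS)
		return 0;
	argv[n++] = "clean";
	argv[n++] = "-dfx";
	argv[n++] = "--";

	for (i = 0; i < nr; i++) {
		if (!entries[i].is_sparse_dir || !entries[i].exists_on_disk)
			continue;
		if (n == cap)
			return 0;
		argv[n++] = entries[i].name;
	}
	return n;
}

/* With no pathspec after "--", the clean would cover the whole worktree. */
static int should_run_clean(size_t argc)
{
	return argc > FIXED_ARGS;
}
```

An untracked, ignored directory at the top level never appears as a sparse directory entry, so under this scheme it is never passed to the clean. This is consistent with the new test expecting `repo/obj` to survive.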

3 files changed: 126 additions, 0 deletions

Documentation/git-sparse-checkout.txt

Lines changed: 6 additions & 0 deletions
@@ -210,6 +210,12 @@ case-insensitive check. This corrects for case mismatched filenames in the
 'git sparse-checkout set' command to reflect the expected cone in the working
 directory.

+When the sparse index is enabled through the `index.sparse` config option,
+the cone mode sparse-checkout patterns will also remove ignored files that
+are not within the sparse-checkout definition. This is important behavior
+to preserve the performance of the sparse index, but also matches that
+cone mode patterns care about directories, not files.
+

 SUBMODULES
 ----------

builtin/sparse-checkout.c

Lines changed: 78 additions & 0 deletions
@@ -15,6 +15,7 @@
 #include "wt-status.h"
 #include "quote.h"
 #include "sparse-index.h"
+#include "run-command.h"

 static const char *empty_base = "";

@@ -100,6 +101,76 @@ static int sparse_checkout_list(int argc, const char **argv)
 	return 0;
 }

+static void clean_tracked_sparse_directories(struct repository *r)
+{
+	int i;
+	struct strvec args = STRVEC_INIT;
+
+	/*
+	 * If we are not using cone mode patterns, then we cannot
+	 * delete directories outside of the sparse cone.
+	 */
+	if (!r || !r->index || !r->index->sparse_checkout_patterns ||
+	    !r->index->sparse_checkout_patterns->use_cone_patterns)
+		return;
+	/*
+	 * NEEDSWORK: For now, only use this behavior when index.sparse
+	 * is enabled. We may want this behavior enabled whenever using
+	 * cone mode patterns.
+	 */
+	prepare_repo_settings(r);
+	if (!r->settings.sparse_index)
+		return;
+
+	strvec_pushl(&args, "clean", "-dfx", "--", NULL);
+
+	/*
+	 * Since we now depend on the sparse index to enable this
+	 * behavior, use it to our advantage. This process is more
+	 * complicated without it.
+	 */
+	convert_to_sparse(r->index);
+
+	for (i = 0; i < r->index->cache_nr; i++) {
+		struct cache_entry *ce = r->index->cache[i];
+
+		/*
+		 * Is this a sparse directory? If so, then definitely
+		 * include it. All contained content is outside of the
+		 * patterns.
+		 */
+		if (S_ISSPARSEDIR(ce->ce_mode) &&
+		    repo_file_exists(r, ce->name)) {
+			strvec_push(&args, ce->name);
+			continue;
+		}
+	}
+
+	/*
+	 * Only run if we found an existing sparse directory, otherwise
+	 * the clean will be across the entire worktree!
+	 */
+	if (args.nr > 3)
+		run_command_v_opt(args.v, RUN_GIT_CMD);
+
+	/*
+	 * The 'git clean -dfx -- <path> ...' command empties the
+	 * tracked directories outside of the sparse cone, but does not
+	 * delete the directories themselves. Remove them now.
+	 */
+	for (i = 3; i < args.nr; i++)
+		rmdir_or_warn(args.v[i]);
+
+	strvec_clear(&args);
+
+	/*
+	 * This is temporary: the sparse-checkout builtin is not
+	 * integrated with the sparse-index yet, so we need to keep
+	 * it full during the process.
+	 */
+	ensure_full_index(r->index);
+}
+
 static int update_working_directory(struct pattern_list *pl)
 {
 	enum update_sparsity_result result;
@@ -141,6 +212,8 @@ static int update_working_directory(struct pattern_list *pl)
 	else
 		rollback_lock_file(&lock_file);

+	clean_tracked_sparse_directories(r);
+
 	r->index->sparse_checkout_patterns = NULL;
 	return result;
 }
@@ -540,8 +613,11 @@ static int modify_pattern_list(int argc, const char **argv, enum modify_type m)
 {
 	int result;
 	int changed_config = 0;
+	struct pattern_list *old_pl = xcalloc(1, sizeof(*old_pl));
 	struct pattern_list *pl = xcalloc(1, sizeof(*pl));

+	get_sparse_checkout_patterns(old_pl);
+
 	switch (m) {
 	case ADD:
 		if (core_sparse_checkout_cone)
@@ -567,7 +643,9 @@ static int modify_pattern_list(int argc, const char **argv, enum modify_type m)
 		set_config(MODE_NO_PATTERNS);

 	clear_pattern_list(pl);
+	clear_pattern_list(old_pl);
 	free(pl);
+	free(old_pl);
 	return result;
 }

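
The two-phase removal in `clean_tracked_sparse_directories` (empty the directories first, then remove them) relies on directory removal failing while any file remains. The sketch below illustrates that ordering with plain POSIX calls. It is an assumption-laden illustration, not Git code: `rmdir_or_warn_sketch` is a hypothetical simplified analogue of Git's `rmdir_or_warn`, the `unlink` call stands in for what `git clean -dfx -- <path>` does to ignored files, and the `/tmp` scratch path is made up for the demonstration.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical simplified analogue of rmdir_or_warn(): try to remove a
 * directory and warn, rather than abort, on failure.
 * Returns 0 on success, or the errno value on failure.
 */
static int rmdir_or_warn_sketch(const char *path)
{
	int err;

	assert(path != NULL);
	if (rmdir(path) == 0)
		return 0;
	err = errno;
	fprintf(stderr, "warning: unable to rmdir '%s': %s\n",
		path, strerror(err));
	return err;
}

/*
 * Create a scratch directory holding one "ignored" file, then show that
 * the directory can only be removed after the file is gone.
 * Returns 0 if the sequence behaves as described, nonzero otherwise.
 */
static int demo_clean_then_rmdir(void)
{
	char dir[64], file[96];
	FILE *fp;

	/* Assumed scratch location; any writable directory would do. */
	snprintf(dir, sizeof(dir), "/tmp/sparse-sketch-%ld", (long)getpid());
	snprintf(file, sizeof(file), "%s/file.o", dir);

	if (mkdir(dir, 0700) != 0)
		return -1;
	fp = fopen(file, "w");
	if (!fp) {
		rmdir(dir);
		return -1;
	}
	fclose(fp);

	/* The directory still holds a file, so removal must fail. */
	if (rmdir_or_warn_sketch(dir) == 0)
		return -1;

	/* Stand-in for the clean step emptying the directory. */
	if (unlink(file) != 0)
		return -1;

	/* Now that the directory is empty, removal succeeds. */
	return rmdir_or_warn_sketch(dir);
}
```

Calling `demo_clean_then_rmdir()` prints one warning to stderr for the first, deliberately premature removal attempt and is expected to return 0.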

t/t1091-sparse-checkout-builtin.sh

Lines changed: 42 additions & 0 deletions
@@ -642,4 +642,46 @@ test_expect_success MINGW 'cone mode replaces backslashes with slashes' '
 	check_files repo/deep a deeper1
 '

+test_expect_success 'cone mode clears ignored subdirectories' '
+	rm repo/.git/info/sparse-checkout &&
+
+	# NEEDSWORK: --sparse-index is required for now
+	git -C repo sparse-checkout init --cone --sparse-index &&
+	git -C repo sparse-checkout set deep/deeper1 &&
+
+	cat >repo/.gitignore <<-\EOF &&
+	obj/
+	*.o
+	EOF
+
+	git -C repo add .gitignore &&
+	git -C repo commit -m ".gitignore" &&
+
+	mkdir -p repo/obj repo/folder1/obj repo/deep/deeper2/obj &&
+	for file in folder1/obj/a obj/a folder1/file.o folder1.o \
+		    deep/deeper2/obj/a deep/deeper2/file.o file.o
+	do
+		echo ignored >repo/$file || return 1
+	done &&
+
+	git -C repo status --porcelain=v2 >out &&
+	test_must_be_empty out &&
+
+	git -C repo sparse-checkout reapply &&
+	test_path_is_missing repo/folder1 &&
+	test_path_is_missing repo/deep/deeper2 &&
+	test_path_is_dir repo/obj &&
+	test_path_is_file repo/file.o &&
+
+	git -C repo status --porcelain=v2 >out &&
+	test_must_be_empty out &&
+
+	git -C repo sparse-checkout set deep/deeper2 &&
+	test_path_is_missing repo/deep/deeper1 &&
+	test_path_is_dir repo/deep/deeper2 &&
+
+	git -C repo status --porcelain=v2 >out &&
+	test_must_be_empty out
+'
+
 test_done
