Commit 74e8add

szeder authored and gitster committed
split-index: add tests to demonstrate the racy split index problem
Ever since the split index feature was introduced [1], refreshing a
split index is prone to a variant of the classic racy git problem.
There are a couple of unrelated tests in the test suite that
occasionally fail when run with 'GIT_TEST_SPLIT_INDEX=yes', but
't1700-split-index.sh', the only test script focusing solely on split
index, has never noticed this issue, because it only cares about how
the index is split under various circumstances and all the different
ways to turn the split index feature on and off.

Add a dedicated test script 't1701-racy-split-index.sh' to exercise
the split index feature in racy situations as well; kind of a
"t0010-racy-git.sh for split index" but with modern style (the tests
do everything in &&-chained list of commands in 'test_expect_...'
blocks, and use 'test_cmp' for more informative output on failure).

The tests cover the following sequences of index splitting, updating,
and racy file modifications, with the last two cases demonstrating the
racy split index problem:

  1. Split the index while adding a racily clean file:

       echo "cached content" >file
       git update-index --split-index --add file
       echo "dirty worktree" >file    # size stays the same

     This case already works properly.  Even though the cache entry's
     stat data matches the modified file in the worktree, subsequent
     git commands will notice that the (split) index and the file
     have the same mtime, and then will go on to check the file's
     content and notice its dirtiness.

  2. Add a racily clean file to an already split index:

       git update-index --split-index
       echo "cached content" >file
       git update-index --add file
       echo "dirty worktree" >file

     This case already works properly.  After the second 'git
     update-index' writes the newly added file's cache entry to the
     new split index, it basically works in the same way as case #1.

  3. Split the index when it (i.e. the not yet split index) contains
     a racily clean cache entry, i.e. an entry whose cached stat data
     matches the corresponding file in the worktree and whose cached
     mtime matches that of the index:

       echo "cached content" >file
       git update-index --add file
       echo "dirty worktree" >file
       # ... wait ...
       git update-index --split-index --add other-file

     This case already works properly.  The shared index is written by
     do_write_index(), i.e. the same function that is responsible for
     writing "regular" and split indexes as well.  This function
     cleverly notices the racily clean cache entry, and writes the
     entry to the new shared index with smudged stat data, i.e. file
     size set to 0.  When subsequent git commands read the index, they
     will notice that the smudged stat data doesn't match the file in
     the worktree, and then go on to check the file's content and
     notice its dirtiness.

  4. Update the split index when it contains a racily clean cache
     entry:

       git update-index --split-index
       echo "cached content" >file
       git update-index --add file
       echo "dirty worktree" >file
       # ... wait ...
       git update-index --add other-file

     This case already works properly.  After the second 'git
     update-index' the newly added file's cache entry is only stored
     in the split index.  If a cache entry is present in the split
     index (even if it is a replacement of an outdated entry in the
     shared index), then it will always be included in the new split
     index on subsequent split index updates (until the file is
     removed or a new shared index is written), independently from
     whether the entry is racily clean or not.  When do_write_index()
     writes the new split index, it notices the racily clean cache
     entry, and smudges its stat data.  Subsequent git commands
     reading the index will notice the smudged stat data and then go
     on to check the file's content and notice its dirtiness.

  5. Update the split index when a racily clean cache entry is stored
     only in the shared index:

       echo "cached content" >file
       git update-index --split-index --add file
       echo "dirty worktree" >file
       # ... wait ...
       git update-index --add other-file

     This case fails due to the racy split index problem.  In the
     second 'git update-index' prepare_to_write_split_index() decides,
     among other things, which cache entries stored only in the shared
     index should be replaced in the new split index.  Alas, this
     function never looks out for racily clean cache entries, and
     since the file's stat data in the worktree hasn't changed since
     the shared index was written, the entry won't be replaced in the
     new split index.  Consequently, do_write_index() doesn't even get
     this racily clean cache entry, and can't smudge its stat data.
     Subsequent git commands will then see that the index has more
     recent mtime than the file and that the (not smudged) cached stat
     data still matches the file in the worktree, and, ultimately,
     will erroneously consider the file clean.

  6. Update the split index after unpack_trees() copied a racily clean
     cache entry from the shared index:

       echo "cached content" >file
       git update-index --split-index --add file
       echo "dirty worktree" >file
       # ... wait ...
       git read-tree -m HEAD

     This case fails due to the racy split index problem.  This
     basically fails for the same reason as case #5 above, but there
     is one important difference, which warrants the dedicated test.
     While that second 'git update-index' in case #5 updates
     index_state in place, in this case 'git read-tree -m' calls
     unpack_trees(), which throws out the entire index, and constructs
     a new one from the (potentially updated) copies of the original's
     cache entries.  Consequently, when prepare_to_write_split_index()
     gets to work on this reconstructed index, it takes a different
     code path than in case #5 when deciding which cache entries in
     the shared index should be replaced.  The result is the same,
     though: the racily clean cache entry goes unnoticed, it isn't
     added to the split index with smudged stat data, and subsequent
     git commands will then erroneously consider the file clean.
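The decision that separates the passing cases from the failing ones can be modeled with a small shell sketch. This is a deliberately simplified illustration, not Git's implementation: 'classify_entry' is a hypothetical helper, only the file size stands in for the whole stat data, and timestamps are plain whole seconds. It assumes the rule described above, namely that an entry whose cached mtime is not older than the index's mtime needs a content comparison.

```shell
#!/bin/sh

# Hypothetical helper, for illustration only (not part of Git).
# usage: classify_entry <cached_size> <worktree_size> <cached_mtime> <index_mtime>
classify_entry () {
	if test "$1" -ne "$2"
	then
		# Stat data differs, e.g. because the size was smudged to 0.
		echo "dirty (stat mismatch)"
	elif test "$4" -le "$3"
	then
		# Index is not newer than the entry: stat data cannot be trusted.
		echo "racily clean (compare content)"
	else
		# Index is newer and stat data matches: content is not checked.
		echo "assumed clean"
	fi
}

# Cases 1 and 2: file and index written within the same second.
classify_entry 15 15 100 100

# Cases 3 and 4: the entry was rewritten with its size smudged to 0.
classify_entry 0 15 100 101

# Cases 5 and 6: the entry was never smudged and the index is newer.
classify_entry 15 15 100 101
```

The last call shows the failure mode: nothing in the stat data hints at the modification, so the dirty file is taken to be clean.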
Note that in the last two 'test_expect_failure' cases I omitted the
'#' (as in nr. of trial) from the tests' description on purpose for
now, as it breaks the TAP output [2]; it will be added at the end of
the series, when those two tests will be flipped to
'test_expect_success'.

[1] In the branch leading to the merge commit v2.1.0-rc0~45 (Merge
    branch 'nd/split-index', 2014-07-16).

[2] In the TAP output a '#' should separate the test's description
    from the TODO directive emitted by 'test_expect_failure'.  The
    additional '#' in "#$trial" interferes with this, the test harness
    won't recognize the TODO directive, and will report that those
    tests failed unexpectedly.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
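The effect described in [2] can be illustrated with a rough shell model of how a TAP consumer might pick out the directive. This is a simplified sketch, not the code of any real harness: 'tap_directive' is a hypothetical function that treats everything after the first '#' on a test line as the directive, and the sample test lines are made up for the demonstration.

```shell
#!/bin/sh

# Hypothetical, simplified model of directive detection in a TAP consumer.
# Prints TODO, SKIP or none for the given test line.
tap_directive () {
	line=$1
	case "$line" in
	*"#"*)
		# Keep only the text after the first "#".
		rest=${line#*'#'}
		;;
	*)
		rest=
		;;
	esac
	# Strip leading spaces from the candidate directive.
	rest=${rest#"${rest%%[! ]*}"}
	case "$rest" in
	TODO*) echo TODO ;;
	SKIP*) echo SKIP ;;
	*) echo none ;;
	esac
}

# Description without an extra "#": the TODO directive is found.
tap_directive "not ok 25 - some known breakage 3 # TODO known breakage"

# Description containing "#3": the text after the first "#" starts
# with "3", so no directive is recognized.
tap_directive "not ok 25 - some known breakage #3 # TODO known breakage"
```

Under this model the second line looks like an ordinary failed test, which matches the unexpected failures described in the note.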
1 parent 18c765e commit 74e8add

1 file changed: 218 additions, 0 deletions
t/t1701-racy-split-index.sh

Lines changed: 218 additions & 0 deletions
@@ -0,0 +1,218 @@
+#!/bin/sh
+
+# This test can give false success if your machine is sufficiently
+# slow or all trials happened to happen on second boundaries.
+
+test_description='racy split index'
+
+. ./test-lib.sh
+
+test_expect_success 'setup' '
+	# Only split the index when the test explicitly says so.
+	sane_unset GIT_TEST_SPLIT_INDEX &&
+	git config splitIndex.maxPercentChange 100 &&
+
+	echo "cached content" >racy-file &&
+	git add racy-file &&
+	git commit -m initial &&
+
+	echo something >other-file &&
+	# No raciness with this file.
+	test-tool chmtime =-20 other-file &&
+
+	echo "+cached content" >expect
+'
+
+check_cached_diff () {
+	git diff-index --patch --cached $EMPTY_TREE racy-file >diff &&
+	tail -1 diff >actual &&
+	test_cmp expect actual
+}
+
+trials="0 1 2 3 4"
+for trial in $trials
+do
+	test_expect_success "split the index while adding a racily clean file #$trial" '
+		rm -f .git/index .git/sharedindex.* &&
+
+		# The next three commands must be run within the same
+		# second (so both writes to racy-file result in the same
+		# mtime) to create the interesting racy situation.
+		echo "cached content" >racy-file &&
+
+		# Update and split the index.  The cache entry of
+		# racy-file will be stored only in the shared index.
+		git update-index --split-index --add racy-file &&
+
+		# File size must stay the same.
+		echo "dirty worktree" >racy-file &&
+
+		# Subsequent git commands should notice that racy-file
+		# and the split index have the same mtime, and check
+		# the content of the file to see if it is actually
+		# clean.
+		check_cached_diff
+	'
+done
+
+for trial in $trials
+do
+	test_expect_success "add a racily clean file to an already split index #$trial" '
+		rm -f .git/index .git/sharedindex.* &&
+
+		git update-index --split-index &&
+
+		# The next three commands must be run within the same
+		# second.
+		echo "cached content" >racy-file &&
+
+		# Update the split index.  The cache entry of racy-file
+		# will be stored only in the split index.
+		git update-index --add racy-file &&
+
+		# File size must stay the same.
+		echo "dirty worktree" >racy-file &&
+
+		# Subsequent git commands should notice that racy-file
+		# and the split index have the same mtime, and check
+		# the content of the file to see if it is actually
+		# clean.
+		check_cached_diff
+	'
+done
+
+for trial in $trials
+do
+	test_expect_success "split the index when the index contains a racily clean cache entry #$trial" '
+		rm -f .git/index .git/sharedindex.* &&
+
+		# The next three commands must be run within the same
+		# second.
+		echo "cached content" >racy-file &&
+
+		git update-index --add racy-file &&
+
+		# File size must stay the same.
+		echo "dirty worktree" >racy-file &&
+
+		# Now wait a bit to ensure that the split index written
+		# below will get a more recent mtime than racy-file.
+		sleep 1 &&
+
+		# Update and split the index when the index contains
+		# the racily clean cache entry of racy-file.
+		# A corresponding replacement cache entry with smudged
+		# stat data should be added to the new split index.
+		git update-index --split-index --add other-file &&
+
+		# Subsequent git commands should notice the smudged
+		# stat data in the replacement cache entry and that it
+		# does not match the file in the worktree.
+		check_cached_diff
+	'
+done
+
+for trial in $trials
+do
+	test_expect_success "update the split index when it contains a new racily clean cache entry #$trial" '
+		rm -f .git/index .git/sharedindex.* &&
+
+		git update-index --split-index &&
+
+		# The next three commands must be run within the same
+		# second.
+		echo "cached content" >racy-file &&
+
+		# Update the split index.  The cache entry of racy-file
+		# will be stored only in the split index.
+		git update-index --add racy-file &&
+
+		# File size must stay the same.
+		echo "dirty worktree" >racy-file &&
+
+		# Now wait a bit to ensure that the split index written
+		# below will get a more recent mtime than racy-file.
+		sleep 1 &&
+
+		# Update the split index when the racily clean cache
+		# entry of racy-file is only stored in the split index.
+		# An updated cache entry with smudged stat data should
+		# be added to the new split index.
+		git update-index --add other-file &&
+
+		# Subsequent git commands should notice the smudged
+		# stat data.
+		check_cached_diff
+	'
+done
+
+for trial in $trials
+do
+	test_expect_failure "update the split index when a racily clean cache entry is stored only in the shared index $trial" '
+		rm -f .git/index .git/sharedindex.* &&
+
+		# The next three commands must be run within the same
+		# second.
+		echo "cached content" >racy-file &&
+
+		# Update and split the index.  The cache entry of
+		# racy-file will be stored only in the shared index.
+		git update-index --split-index --add racy-file &&
+
+		# File size must stay the same.
+		echo "dirty worktree" >racy-file &&
+
+		# Now wait a bit to ensure that the split index written
+		# below will get a more recent mtime than racy-file.
+		sleep 1 &&
+
+		# Update the split index when the racily clean cache
+		# entry of racy-file is only stored in the shared index.
+		# A corresponding replacement cache entry with smudged
+		# stat data should be added to the new split index.
+		#
+		# Alas, such a smudged replacement entry is not added!
+		git update-index --add other-file &&
+
+		# Subsequent git commands should notice the smudged
+		# stat data.
+		check_cached_diff
+	'
+done
+
+for trial in $trials
+do
+	test_expect_failure "update the split index after unpack_trees() copied a racily clean cache entry from the shared index $trial" '
+		rm -f .git/index .git/sharedindex.* &&
+
+		# The next three commands must be run within the same
+		# second.
+		echo "cached content" >racy-file &&
+
+		# Update and split the index.  The cache entry of
+		# racy-file will be stored only in the shared index.
+		git update-index --split-index --add racy-file &&
+
+		# File size must stay the same.
+		echo "dirty worktree" >racy-file &&
+
+		# Now wait a bit to ensure that the split index written
+		# below will get a more recent mtime than racy-file.
+		sleep 1 &&
+
+		# Update the split index after unpack_trees() copied the
+		# racily clean cache entry of racy-file from the shared
+		# index.  A corresponding replacement cache entry
+		# with smudged stat data should be added to the new
+		# split index.
+		#
+		# Alas, such a smudged replacement entry is not added!
+		git read-tree -m HEAD &&
+
+		# Subsequent git commands should notice the smudged
+		# stat data.
+		check_cached_diff
+	'
+done
+
+test_done
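Every test in the script depends on "File size must stay the same": the cached and the dirty content must be equally long, otherwise the size recorded in the cache entry would already reveal the change and no race would be exercised. A minimal standalone check of that precondition, assuming a hypothetical 'byte_size' helper that is not part of the test script:

```shell
#!/bin/sh

# Hypothetical helper: number of bytes that `echo "$1" >file` would
# write, i.e. the text plus the trailing newline.
byte_size () {
	# tr strips the padding some wc implementations add.
	printf "%s\n" "$1" | wc -c | tr -d " "
}

cached=$(byte_size "cached content")
dirty=$(byte_size "dirty worktree")

if test "$cached" -eq "$dirty"
then
	echo "same size: $cached bytes"
else
	echo "sizes differ: $cached vs $dirty bytes"
fi
```

Both strings are 14 characters long, so each write produces a 15-byte file and only the content differs.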
