Web lists-archives.com

[RFC PATCH 3/5] split index: add a test to demonstrate the racy split index problem




Ever since the split index feature was introduced [1], refreshing a
split index is prone to a variant of the classic racy git problem.
There are a couple of unrelated tests in the test suite that
occasionally fail when run with 'GIT_TEST_SPLIT_INDEX=yes', but
't1700-split-index.sh', the only test script focusing solely on split
index, has never noticed this issue, because it only cares about how
the index is split under various circumstances and all the different
ways to turn the split index feature on and off.

Add a dedicated test script 't1701-racy-split-index.sh' to exercise
the split index feature in racy situations as well; kind of a
"t0010-racy-git.sh for split index" but with modern style (the tests
do everything in &&-chained list of commands in 'test_expect_...'
blocks, and use 'test_cmp' for more informative output on failure).

The tests cover the following sequences of index splitting, updating,
and racy file modifications, with case #3 demonstrating the racy split
index problem:

  1. Split the index when the worktree contains a racily clean file:

       echo "cached content" >file
       git update-index --split-index --add file
       echo "dirty worktree" >file    # size stays the same

     This case already works properly.  Even though the cache entry's
     stat data matches with the file in the worktree, subsequent git
     commands will notice that the (split) index and the file have the
     same mtime, and then will go on to check the file's content.

  2. Split the index when it (i.e. the not yet splitted index)
     contains a racily clean cache entry, i.e. an entry whose cached
     stat data matches with the corresponding file in the worktree and
     the cached mtime matches that of the index:

       echo "cached content" >file
       git update-index --add file
       echo "dirty worktree" >file
       # ... wait ...
       git update-index --split-index --add other-file

     This case already works properly.  The shared index is written by
     do_write_index(), i.e. the same function that is responsible for
     writing "regular" and split indexes as well.  This function
     cleverly notices the racily clean cache entry, and writes the
     entry to the new shared index with smudged stat data, i.e. file
     size set to 0.  When subsequent git commands read the index, they
     will notice that the smudged stat data doesn't match with the
     file in the worktree, and then go on to check the file's content.

  3. Update the split index when the shared index contains a racily
     clean cache entry:

       echo "cached content" >file
       git update-index --split-index --add file
       echo "dirty worktree" >file
       # ... wait ...
       git update-index --add other-file

     This case fails due to the racy split index problem.  In the
     second 'git update-index' prepare_to_write_split_index() gathers
     all cache entries that should be written to the new split index.
     Alas, this function never looks out for racily clean cache
     entries, and since the file's stat data in the worktree hasn't
     changed since the shared index was written, the entry won't be
     replaced in the new split index.  Consequently, do_write_index()
     doesn't even get this racily clean cache entry, and can't smudge
     its stat data.  Subsequent git commands will then see that the
     index has more recent mtime than the file and that the (not
     smudged) cached stat data still matches with the file in the
     worktree, and, ultimately, will erroneously consider the file
     clean.

  4. Add a racily clean file to an already split index:

       git update-index --split-index
       echo "cached content" >file
       git update-index --add file
       echo "dirty worktree" >file

     This case already works properly.  After the second 'git
     update-index' writes the newly added file's cache entry to the
     new split index, it basically works in the same way as case #1.

  5. Update the split index when it contains a racily clean cache
     entry:

       git update-index --split-index
       echo "cached content" >file
       git update-index --add file
       echo "dirty worktree" >file
       # ... wait ...
       git update-index --add other-file

     This case already works properly.  Since the second 'git
     update-index' writes the newly added file's cache entry to the
     split index, the third 'git update-index's do_write_index() now
     has a chance to look at this entry, notices its raciness, and
     writes smudged stat data to the new split index.

[1] In the branch leading to the merge commit v2.1.0-rc0~45 (Merge
    branch 'nd/split-index', 2014-07-16).

Signed-off-by: SZEDER Gábor <szeder.dev@xxxxxxxxx>
---
 t/t1701-racy-split-index.sh | 164 ++++++++++++++++++++++++++++++++++++
 1 file changed, 164 insertions(+)
 create mode 100755 t/t1701-racy-split-index.sh

diff --git a/t/t1701-racy-split-index.sh b/t/t1701-racy-split-index.sh
new file mode 100755
index 0000000000..c92ba9ce5e
--- /dev/null
+++ b/t/t1701-racy-split-index.sh
@@ -0,0 +1,164 @@
+#!/bin/sh
+
+# This test can give false success if your machine is sufficiently
+# slow or all trials happened to happen on second boundaries.
+
+test_description='racy split index'
+
+. ./test-lib.sh
+
+test_expect_success 'setup' '
+	# Only split the index when the test explicitly says so.
+	sane_unset GIT_TEST_SPLIT_INDEX GIT_FSMONITOR_TEST &&
+	git config splitIndex.maxPercentChange 100 &&
+
+	echo something >other-file &&
+	# No raciness with this file.
+	test-tool chmtime =-20 other-file
+'
+
+check_cached_diff () {
+	git diff-index --patch --cached $EMPTY_TREE racy-file >diff &&
+	tail -1 diff >actual &&
+	echo "+cached content" >expect &&
+	test_cmp expect actual
+}
+
+for trial in 0 1 2 3 4
+do
+	test_expect_success "split the index when the worktree contains a racily clean file #$trial" '
+		test_when_finished "rm -f .git/index .git/sharedindex.*" &&
+
+		# The next three commands must be run within the same
+		# second (so both writes to racy-file result in the same
+		# mtime) to create the interesting racy situation.
+		echo "cached content" >racy-file &&
+
+		# Update and split the index.  The cache entry of
+		# racy-file will be stored in the shared index.
+		git update-index --split-index --add racy-file &&
+
+		# File size must stay the same.
+		echo "dirty worktree" >racy-file &&
+
+		check_cached_diff
+	'
+done
+
+for trial in 0 1 2 3 4
+do
+	test_expect_success "split the index when it contains a racily clean cache entry #$trial" '
+		test_when_finished "rm -f .git/index .git/sharedindex.*" &&
+
+		# The next three commands must be run within the same
+		# second.
+		echo "cached content" >racy-file &&
+
+		git update-index --add racy-file &&
+
+		# File size must stay the same.
+		echo "dirty worktree" >racy-file &&
+
+		# Now wait a bit to ensure that the split index written
+		# below will get a more recent mtime than racy-file,
+		# and, consequently, subsequent git commands wont
+		# consider the entry racily clean.
+		sleep 1 &&
+
+		# Update and split the index when it contains the
+		# racily clean cache entry of racy-file; the stat data
+		# in that entry should be smudged, so the file wont
+		# appear clean for subsequent git commands.
+		git update-index --split-index --add other-file &&
+
+		check_cached_diff
+	'
+done
+
+for trial in 0 1 2 3 4
+do
+	test_expect_failure "update the split index when the shared index contains a racily clean cache entry #$trial" '
+		test_when_finished "rm -f .git/index .git/sharedindex.*" &&
+
+		# The next three commands must be run within the same
+		# second.
+		echo "cached content" >racy-file &&
+
+		# Update and split the index.  The cache entry of
+		# racy-file will be stored in the shared index.
+		git update-index --split-index --add racy-file &&
+
+		# File size must stay the same.
+		echo "dirty worktree" >racy-file &&
+
+		# Now wait a bit to ensure that the split index written
+		# below will get a more recent mtime than racy-file.
+		sleep 1 &&
+
+		# Update the split index when the shared index contains
+		# the racily clean cache entry of racy-file.  A
+		# corresponding replacement cache entry with smudged
+		# stat data should be added to the new split index, so
+		# the file wont appear clean for subsequent git commands.
+		#
+		# Alas, such a smudged replacement entry is not added!
+		git update-index --add other-file &&
+
+		check_cached_diff
+	'
+done
+
+for trial in 0 1 2 3 4
+do
+	test_expect_success "add a racily clean file to an already split index #$trial" '
+		test_when_finished "rm -f .git/index .git/sharedindex.*" &&
+
+		git update-index --split-index &&
+
+		# The next three commands must be run within the same
+		# second.
+		echo "cached content" >racy-file &&
+
+		# Update the split index.  The cache entry of racy-file
+		# will be stored in the split index.
+		git update-index --add racy-file &&
+
+		# File size must stay the same.
+		echo "dirty worktree" >racy-file &&
+
+		check_cached_diff
+	'
+done
+
+for trial in 0 1 2 3 4
+do
+	test_expect_success "update the split index when it contains a racily clean cache entry #$trial" '
+		test_when_finished "rm -f .git/index .git/sharedindex.*" &&
+
+		git update-index --split-index &&
+
+		# The next three commands must be run within the same
+		# second.
+		echo "cached content" >racy-file &&
+
+		# Update the split index.  The cache entry of racy-file
+		# will be stored in the split index.
+		git update-index --add racy-file &&
+
+		# File size must stay the same.
+		echo "dirty worktree" >racy-file &&
+
+		# Now wait a bit to ensure that the split index written
+		# below will get a more recent mtime than racy-file.
+		sleep 1 &&
+
+		# Update the split index when it contains the racily
+		# clean cache entry of racy-file; the stat data in that
+		# entry should be smudged.
+		git update-index --add other-file &&
+
+		check_cached_diff
+	'
+done
+
+test_done
-- 
2.19.0.rc0.188.g56c5ee2db1