
Re: [PATCH v1] dir.c: don't flag the index as dirty for changes to the untracked cache

On 2/5/2018 4:58 PM, Brandon Williams wrote:
On 02/05, Ben Peart wrote:
The untracked cache saves its current state in the UNTR index extension.
Currently, _any_ change to that state causes the index to be flagged as dirty
and written out to disk.  Unfortunately, the cost to write out the index can
exceed the savings gained by using the untracked cache.  Since it is a cache
that can be updated from the current state of the working directory, there is
no functional requirement that the index be written out for every change to the
untracked cache.

Update the untracked cache logic so that it no longer forces the index to be
written to disk except in the case where the extension is being turned on or
off.  When some other git command requires the index to be written to disk, the
untracked cache will take advantage of that to save its updated state as well.
This results in a performance win when looked at over common sequences of git
commands (e.g. a status followed by add, commit, etc.).

After this patch, all the logic to track statistics for the untracked cache
could be removed as it is only used by debug tracing used to debug the untracked

So we don't need to update it every time because it's just a cache
and if it's inaccurate between status calls that's ok?  So only
operations like add and commit will actually write out the untracked
cache (as a part of writing out the index).  Sounds ok.

What benefit is there to using the untracked cache then?  Sounds like
you should just turn it off instead?
(I'm sure this is a naive question :D )

The parts of the untracked cache that have not changed since the extension was updated are still cached and valid. Only those directories that have changes will need to be checked.

With the old behavior, making a change in dir1/, then calling status would update the dir1/ untracked cache entry, flag the index as dirty and write it out. On the next status, git would detect that no changes have been made and use the cached data for dir1/.

With the new behavior, making a change in dir1/, then calling status would update the dir1/ untracked cache entry but not write it out. On the next status, git would detect the change in dir1/ again and update the untracked cache. All of the other cached entries are still valid and the cache would be used for them. The updated cache entry for dir1/ would not get persisted to disk until some command required the index to be written out.

The behavior is correct in both cases. You just don't get the benefit of the updated cache for the dir1/ entry until the index is persisted again. What you gain in exchange is that you don't have to write out the index, which is (typically) a lot more expensive than checking dir1/ for changes.
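
To make the trade-off concrete, here is a rough toy model of the two behaviors described above. It is only a sketch and not git's code: the names `simulate`, `struct counts`, `enum cmd` and `enum policy` are hypothetical, each command is treated as a separate process that reloads the index from disk, and "add" is simply assumed to refresh the untracked cache and to change the index. The starting state is "dir1/ was just modified".

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: these types are hypothetical and are not git's. */
enum cmd {
	CMD_STATUS, /* refreshes the untracked cache, changes nothing else */
	CMD_ADD     /* assumed to refresh the cache and also change the index */
};

enum policy {
	POLICY_OLD, /* any untracked cache update marks the index dirty */
	POLICY_NEW  /* untracked cache updates alone never mark it dirty */
};

struct counts {
	int index_writes; /* times the whole index was written to disk */
	int dir_rescans;  /* times dir1/ had to be re-read */
};

/*
 * Run a sequence of commands, each loading the index from disk.
 * Initially the on-disk cache entry for dir1/ is stale.
 */
struct counts simulate(enum policy policy, const enum cmd *cmds, size_t n)
{
	struct counts c = { 0, 0 };
	int on_disk_entry_stale = 1;
	size_t i;

	assert(cmds != NULL || n == 0);

	for (i = 0; i < n; i++) {
		int index_dirty = 0;

		if (on_disk_entry_stale) {
			/* The cached entry cannot be trusted: re-read dir1/. */
			c.dir_rescans++;
			if (policy == POLICY_OLD)
				index_dirty = 1;
		}
		if (cmds[i] == CMD_ADD)
			index_dirty = 1;

		if (index_dirty) {
			/* Writing the index also persists the refreshed entry. */
			c.index_writes++;
			on_disk_entry_stale = 0;
		}
	}
	return c;
}
```

For the sequence status, status, add, status, this model gives two index writes and one rescan of dir1/ under the old policy, versus one index write and three rescans under the new one. The patch is a win whenever a saved index write costs more than the extra rescans.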

Signed-off-by: Ben Peart <benpeart@xxxxxxxxxxxxx>

     Base Ref: master
     Web-Diff: https://github.com/benpeart/git/commit/20c2e8d787
     Checkout: git fetch https://github.com/benpeart/git untracked-cache-v1 && git checkout 20c2e8d787

  dir.c                             | 3 ++-
  t/t7063-status-untracked-cache.sh | 3 +++
  2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/dir.c b/dir.c
index 7c4b45e30e..da93374f0c 100644
--- a/dir.c
+++ b/dir.c
@@ -2297,7 +2297,8 @@ int read_directory(struct dir_struct *dir, struct index_state *istate,
-		if (dir->untracked == istate->untracked &&
+		if (getenv("GIT_TEST_UNTRACKED_CACHE") &&
+			dir->untracked == istate->untracked &&
  		    (dir->untracked->dir_opened ||
  		     dir->untracked->gitignore_invalidated ||
diff --git a/t/t7063-status-untracked-cache.sh b/t/t7063-status-untracked-cache.sh
index e5fb892f95..6ef520e823 100755
--- a/t/t7063-status-untracked-cache.sh
+++ b/t/t7063-status-untracked-cache.sh
@@ -14,6 +14,9 @@ test_description='test untracked cache'
  # See <20160803174522.5571-1-pclouds@xxxxxxxxx> if you want to know
  # more.
  sync_mtime () {
  	find . -type d -ls >/dev/null

base-commit: 5be1f00a9a701532232f57958efab4be8c959a29