Re: [PATCH v10 3/3] read-cache: speed up add_index_entry during checkout

On 4/17/2017 10:53 AM, Jeff Hostetler wrote:
> On 4/15/2017 1:55 PM, René Scharfe wrote:
>> On 14.04.2017 at 21:12, git@xxxxxxxxxxxxxxxxx wrote:
>>> From: Jeff Hostetler <jeffhost@xxxxxxxxxxxxx>
>>
>> Very nice, especially the perf test!  But can we unbundle the different
>> optimizations into separate patches with their own performance numbers?
>> Candidates IMHO: The change in add_index_entry_with_check(), the first
>> hunk in has_dir_name(), the loop shortcuts.  That might also help find
>> the reason for the slight slowdown of 0006.3 for the kernel repository.
>
> Let me take a look at this and see if it helps.

Last night I pushed up version 11 which has the 3 parts
of read-cache.c in 3 commits (but still in the same patch
series).  This should allow for more experimentation.

The add_index_entry_with_check() shows a gain.  For the
operations in p0006 on linux.git, the short-cut was being
taken 57993 of 57994 times.
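
To illustrate why an append short-cut can be taken almost every time
when entries arrive in sorted order, here is a minimal sketch.  It is
not the code from the patch: struct sorted_paths, find_insert_pos()
and the two counters are hypothetical names, and plain strcmp() stands
in for the real index entry comparison.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the index: a sorted array of path names. */
struct sorted_paths {
	const char **names;
	size_t nr;
	size_t shortcut_hits;	/* times the append fast path was taken */
	size_t full_searches;	/* times we fell back to binary search */
};

/* Return the position where `path` belongs so that names[] stays sorted. */
static size_t find_insert_pos(struct sorted_paths *sp, const char *path)
{
	size_t lo = 0, hi;

	assert(sp && path);
	hi = sp->nr;

	/*
	 * Fast path: the array is empty or `path` sorts after the current
	 * last entry, so it can only go at the end.  No search is needed.
	 */
	if (sp->nr == 0 || strcmp(sp->names[sp->nr - 1], path) < 0) {
		sp->shortcut_hits++;
		return sp->nr;
	}

	/* Slow path: ordinary binary search for the first name >= path. */
	sp->full_searches++;
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (strcmp(sp->names[mid], path) < 0)
			lo = mid + 1;
		else
			hi = mid;
	}
	return lo;
}
```

When the caller feeds paths in already-sorted order, every call except
an out-of-order one lands on the fast path, which is the kind of ratio
the counters above report.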

The top of has_dir_name() -- by itself -- does not show a
gain, because the short-cut only triggers when the paths
have no prefix in common, which only happens when the
top-level directory changes.  On linux.git, this was 19 of
57993.  However, it does set us up for the next part.
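
The "no prefix in common" condition can be sketched as follows.  This
is only an illustration under simplifying assumptions: the helper names
common_prefix_len() and can_skip_dir_check() are hypothetical and are
not taken from read-cache.c, and the check assumes the new path is
being compared against the last entry of a sorted list.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: number of leading bytes that a and b share. */
static size_t common_prefix_len(const char *a, const char *b)
{
	size_t i = 0;

	assert(a && b);
	while (a[i] && a[i] == b[i])
		i++;
	return i;
}

/*
 * Return 1 when `path` sorts after `last` and the two share no leading
 * bytes at all.  In that case `path` cannot live under any directory
 * that `last` lives under, so a directory conflict check against the
 * tail of the list could be skipped.
 */
static int can_skip_dir_check(const char *last, const char *path)
{
	return strcmp(last, path) < 0 && common_prefix_len(last, path) == 0;
}
```

For example, "arch/x86/Makefile" followed by "block/bio.c" qualifies,
while "fs/ext4/inode.c" followed by "fs/ext4/super.c" does not, which
is why the condition fires so rarely within a large tree.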

The 3 loop body short-cuts hit 54372, 3509, and 86 (sum
57967) times.  So in p0006, the search was attempted only
7 times (57993 - 19 - 57967).
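
The tally can be double-checked mechanically.  The snippet below only
restates the counts quoted in this message; the constant and function
names are made up for the illustration.

```c
#include <assert.h>

/* Counts quoted above for the p0006 operations on linux.git. */
enum {
	SHORTCUT_TAKEN   = 57993,	/* add_index_entry_with_check() short-cut */
	NO_COMMON_PREFIX = 19,		/* top of has_dir_name() */
	LOOP_HIT_1       = 54372,	/* first loop body short-cut */
	LOOP_HIT_2       = 3509,	/* second loop body short-cut */
	LOOP_HIT_3       = 86		/* third loop body short-cut */
};

/* Total hits across the three loop body short-cuts. */
static int loop_hits(void)
{
	return LOOP_HIT_1 + LOOP_HIT_2 + LOOP_HIT_3;
}

/* Calls left over that still had to attempt the search. */
static int searches_remaining(void)
{
	int left = SHORTCUT_TAKEN - NO_COMMON_PREFIX - loop_hits();

	assert(left >= 0);
	return left;
}
```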


WRT the slowdown of 0006.3 on linux.git, I suspect this is
I/O noise.  In the commit message for part 2 in V11, I
include 2 runs on linux.git that show wide variance in the
0006.3 times.  And given the nature of that test, the speedup
in the lookups is completely hidden by the I/O of the full
checkouts.  When I step up to a repo with 4M files, the
results are very clear.

https://public-inbox.org/git/20170417213734.55373-6-git@xxxxxxxxxxxxxxxxx/