Web lists-archives.com

Re: js/partial-clone-connectivity-check (was: What's cooking in git.git (Apr 2019, #05; Thu, 25))

On Wed, May 01, 2019 at 10:52:06PM -0400, Jeff King wrote:

> > > * js/partial-clone-connectivity-check (2019-04-21) 1 commit
> > >   (merged to 'next' on 2019-04-25 at ebd8b4bffd)
> > >  + clone: do faster object check for partial clones
> > > 
> > >  During an initial "git clone --depth=..." partial clone, it is
> > >  pointless to spend cycles for a large portion of the connectivity
> > >  check that enumerates and skips promisor objects (which by
> > >  definition is all objects fetched from the other side).  This has
> > >  been optimized out.
> > > 
> > >  Will merge to 'master'.
> > 
> > Peff asked for a perf test for this [1], but I haven't had time to write one
> > yet. I can do that in a separate patch if you still want to merge this
> > as-is.
> I won't die without one, but it would be nice. It may also be that an
> existing perf test, but I don't think we cover partial clones in t/perf
> at all. Might be worth just a straight-up "git clone --filter=blob:none"
> test.

Here's what I came up with. Note that there's a bug in 'master' right
now which causes perf to produce nonsense results. It's due to my
0baf78e7bc (perf-lib.sh: rely on test-lib.sh for --tee handling,
2019-03-15). I'll fix that separately (the timing below is done with
that commit reverted).

-- >8 --
Subject: [PATCH] t/perf: add perf script for partial clones

We don't cover the partial clone feature at all in t/perf. Let's at
least run a few basic tests so that we'll notice any regressions.

We'll do a no-blob clone, and split it into two parts: the actual object
transfer, and the subsequent checkout (which will of course require
another transfer to get the blobs). That will help us more clearly
assess the performance of each.

There are obviously a lot more possibilities besides just a no-blob
partial clone, but this should serve as a canary that alerts us to any
generic slow-downs (and we can add more tests later for cases that
aren't exercised here).

There are a few non-ideal things here that make this not an entirely
accurate test, but are probably OK for our purposes:

  1. We have to do some extra prep/cleanup work inside the timing tests,
     since they impact the on-disk state and the perf harness may run
     each one multiple times.

     In practice this is probably OK, since these bits should be much
     less expensive than the operations we are measuring.

  2. The clone time is likely to be dominated by the server's object
     enumeration. In the real world, a repo large enough to drive people
     to partial clones is likely to have reachability bitmaps enabled.

     And in the opposite direction, our object transfer is happening at
     the speed of a local pipe, whereas in the real world it would
     bottle-neck on the network.

     So any percentage speedups should be taken with a grain of salt.
     But hopefully any regressions will produce enough of an effect to
     be noticeable.

This script also demonstrates the recent improvement from dfa33a298d
(clone: do faster object check for partial clones, 2019-04-19):

  Test                          dfa33a298d^         dfa33a298d
  5600.2: clone without blobs   18.41(22.72+1.09)   6.83(11.65+0.50) -62.9%
  5600.3: checkout of result    1.82(3.24+0.26)     1.84(3.24+0.26) +1.1%

Signed-off-by: Jeff King <peff@xxxxxxxx>
The speedup from dfa33a298d was larger than I expected. Doing a full
`rev-list --objects --all` on a non-partial clone is only 3-4s. So how
did we manage to save 11s by removing it? The answer is that the
is_promisor check is super expensive: we have to walk all of those trees
in the partial pack to find which blobs are mentioned, only to then walk
them all again for the actual traversal and say "yep, this is in our
promisor list". Yikes.

Your patch makes all of that go away, but I do think there's room for us
to be more clever here (but that leads us back down the rabbit hole you
explored earlier, so I think it makes sense to take your patch now and
deal with other optimizations separately).

 t/perf/p5600-partial-clone.sh | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)
 create mode 100755 t/perf/p5600-partial-clone.sh

diff --git a/t/perf/p5600-partial-clone.sh b/t/perf/p5600-partial-clone.sh
new file mode 100755
index 0000000000..3e04bd2ae1
--- /dev/null
+++ b/t/perf/p5600-partial-clone.sh
@@ -0,0 +1,26 @@
+test_description='performance of partial clones'
+. ./perf-lib.sh
+test_expect_success 'enable server-side config' '
+	git config uploadpack.allowFilter true &&
+	git config uploadpack.allowAnySHA1InWant true
+test_perf 'clone without blobs' '
+	rm -rf bare.git &&
+	git clone --no-local --bare --filter=blob:none . bare.git
+test_perf 'checkout of result' '
+	rm -rf worktree &&
+	mkdir -p worktree/.git &&
+	tar -C bare.git -cf - . | tar -C worktree/.git -xf - &&
+	git -C worktree config core.bare false &&
+	git -C worktree checkout -f