Re: [PATCH] fetch-pack: binary search when storing wanted-refs
- Date: Wed, 27 Mar 2019 23:13:14 -0400
- From: Jeff King <peff@xxxxxxxx>
- Subject: Re: [PATCH] fetch-pack: binary search when storing wanted-refs
On Wed, Mar 27, 2019 at 02:11:10PM -0700, Jonathan Tan wrote:
> In do_fetch_pack_v2(), the "sought" array is sorted by name, and it is
> not subsequently reordered (within the function). Therefore,
> receive_wanted_refs() can assume that "sought" is sorted, and can thus
> use a binary search when storing wanted-refs retrieved from the server.
> Replace the existing linear search with a binary search. This improves
> performance significantly when mirror cloning a repository with more
> than 1 million refs.
This looks good.
The flow in do_fetch_pack_v2() is a little funny. I wanted to
double-check that we always sorted the sought list, because it only
happens in the CHECK_LOCAL state of the state machine.
But as far as I can tell we always begin the function in CHECK_LOCAL, it
always transitions out to another state, and we never go back to it. So
all of that part of the state-machine switch could really just be done
once before entering the state-machine's while loop.
Not really relevant to your patch, but maybe worth tweaking separately
(or maybe not, if people find the all-in-a-state-machine style more
readable; I found it more confusing to reason about).
> +static int cmp_name_ref(const void *name, const void *ref)
> + return strcmp(name, (*(struct ref **)ref)->name);
After some errors with qsort comparison functions a while back, I
double-checked that this has the right number of asterisks. I believe it
and the child thread it spawned are interesting reading, though I
don't think we ever followed up on it.