Web lists-archives.com

Re: [RFC PATCH] index-pack: improve performance on NFS




"Jansen, Geert" <gerardu@xxxxxxxxxx> writes:

> The index-pack command determines if a sha1 collision test is needed by
> checking the existence of a loose object with the given hash. In my tests, I
> can improve performance of “git clone” on Amazon EFS by 8x when used with a
> non-default mount option (lookupcache=pos) that's required for a Gitlab HA
> setup. My assumption is that this check is unnecessary when cloning into a new
> repository because the repository will be empty.

My knee-jerk reaction is that your insight that we can skip the "dup
check" when starting from emptiness is probably correct, but your
use of .cloning flag as an approximation of "are we starting from
emptiness?" is probably wrong, at least for two reasons.

 - "git clone --reference=..." does not strictly start from
   emptiness, and would still have to make sure that incoming pack
   does not try to inject an object with different contents but with
   the same name.

 - "git init && git fetch ..." starts from emptiness and would want
   to benefit from the same optimization as you are implementing
   here.

As to the implementation, I think the patch adjusts the right "if()"
condition to skip the collision test here.

> -	if (startup_info->have_repository) {
> +	if (startup_info->have_repository && !cloning) {
>  		read_lock();
>  		collision_test_needed =
>  			has_sha1_file_with_flags(oid->hash, OBJECT_INFO_QUICK);

I just do not think !cloning is quite correct.

Thanks.