Re: [PATCH 2/5] gc: convert to using the_hash_algo




On Thu, Mar 14, 2019 at 12:54:36AM +0100, Ævar Arnfjörð Bjarmason wrote:

> There's been a lot of changing of the hardcoded "40" values to
> the_hash_algo->hexsz, but we've so far missed this one where we
> hardcoded 38 for the loose object file length.
> 
> This is because a SHA-1 like abcde[...] gets turned into
> objects/ab/cde[...]. There's no reason to suppose the same won't be
> the case for SHA-256, and reading between the lines in
> hash-function-transition.txt the format is planned to be the same.

Yep, makes sense.
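
To spell out the arithmetic: the first two hex digits become the fan-out directory, so the file name is hexsz - 2 characters long (38 for SHA-1, 62 for SHA-256). A minimal sketch follows; loose_path() is a made-up helper for illustration, not anything in git:

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not git's API: split a hex object name into the
 * "ab/cde..." form used under objects/. Returns the length of the file
 * name part (hexsz - 2), or -1 if the name is too short or the output
 * buffer is too small.
 */
static int loose_path(const char *hex, char *out, size_t outsz)
{
	size_t hexsz = strlen(hex);

	/* need room for the name, one '/' and the trailing NUL */
	if (hexsz < 3 || outsz < hexsz + 2)
		return -1;
	/* two hex digits for the fan-out directory, the rest is the file name */
	snprintf(out, outsz, "%.2s/%s", hex, hex + 2);
	return (int)(hexsz - 2);
}
```

Feeding it a 40-character name yields a 38-character file name, and a 64-character name yields 62, which is the value the patch computes as hexsz_loose.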

> However, we may want to modify this code for the hash function
> transition. There's a potential pathological case here where we'll
> only consider the loose objects for the currently active hash, but
> objects for that hash will share a directory storage with the other
> hash.
> 
> Thus we could theoretically have 1k SHA-1 loose objects, and say 1
> million SHA-256 objects, and not notice because we're currently using
> SHA-1.

I agree that we may end up needing to touch this, but I think this patch
doesn't make anything worse in that respect (and likely makes it better,
since we at least know this "38" is supposed to be a hash).
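
To make the pathological case concrete, here is a rough sketch of a length-based count. This is an assumption-laden illustration, not the loop in builtin/gc.c; count_loose() and its array-of-names interface are invented for the example:

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical sketch, not the code in builtin/gc.c: count the entries
 * of one fan-out directory whose names look like loose objects for a
 * hash with the given hex size.
 */
static int count_loose(const char **names, size_t nr, unsigned hexsz)
{
	const size_t hexsz_loose = hexsz - 2;
	int num_loose = 0;
	size_t i;

	for (i = 0; i < nr; i++) {
		const char *name = names[i];

		/* names of any other length, or with non-hex characters, are skipped */
		if (strlen(name) != hexsz_loose ||
		    strspn(name, "0123456789abcdef") != hexsz_loose)
			continue;
		num_loose++;
	}
	return num_loose;
}
```

With hexsz set to 40, any 62-character SHA-256 names in the same directory are simply not counted, which is the scenario described above. The length check is at least now tied to the hash rather than to a bare "38".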

> diff --git a/builtin/gc.c b/builtin/gc.c
> index 8c2312681c..9c2c63276d 100644
> --- a/builtin/gc.c
> +++ b/builtin/gc.c
> @@ -156,6 +156,8 @@ static int too_many_loose_objects(void)
>  	int auto_threshold;
>  	int num_loose = 0;
>  	int needed = 0;
> +	const unsigned hexsz = the_hash_algo->hexsz;
> +	const unsigned hexsz_loose = hexsz - 2;

It doesn't look like hexsz gets used anywhere else; is it worth having
the extra variable? (Admittedly this is quite a nit).

-Peff