Web lists-archives.com

Re: [PATCH v3] unpack-trees: avoid duplicate ODB lookups during checkout




git@xxxxxxxxxxxxxxxxx writes:

> +	/*
> +	 * Fetch the tree from the ODB for each peer directory in the
> +	 * n commits.
> +	 *
> +	 * For 2- and 3-way traversals, we try to avoid hitting the
> +	 * ODB twice for the same OID.  This should yield a nice speed
> +	 * up in checkouts and merges when the commits are similar.
> +	 *
> +	 * We don't bother doing the full O(n^2) search for larger n,
> +	 * because wider traversals don't happen that often and we
> +	 * avoid the search setup.
> +	 *
> +	 * When 2 peer OIDs are the same, we just copy the tree
> +	 * descriptor data.  This implicitly borrows the buffer
> +	 * data from the earlier cell.

This "borrowing" made me worried, but it turns out that this is
perfectly OK.

fill_tree_descriptor() uses read_sha1_file() to give a tree_desc its
own copy of the tree object data, so the code that calls into the
tree traversal API is responsible for freeing the buffer returned
from fill_tree_descriptor().  The updated code avoids double freeing
by setting buf[i] to NULL for borrowing [i].

> +	 */
>  	for (i = 0; i < n; i++, dirmask >>= 1) {
> -		const unsigned char *sha1 = NULL;
> -		if (dirmask & 1)
> -			sha1 = names[i].oid->hash;
> -		buf[i] = fill_tree_descriptor(t+i, sha1);
> +		if (i > 0 && are_same_oid(&names[i], &names[i - 1])) {
> +			t[i] = t[i - 1];
> +			buf[i] = NULL;
> +		} else if (i > 1 && are_same_oid(&names[i], &names[i - 2])) {
> +			t[i] = t[i - 2];
> +			buf[i] = NULL;
> +		} else {
> +			const unsigned char *sha1 = NULL;
> +			if (dirmask & 1)
> +				sha1 = names[i].oid->hash;
> +			buf[i] = fill_tree_descriptor(t+i, sha1);
> +		}
>  	}
>  
>  	bottom = switch_cache_bottom(&newinfo);