Web lists-archives.com

Re: [PATCH 00/12] Clean up notes-related code around `load_subtree()`




On 09/09/2017 12:31 PM, Jeff King wrote:
> On Sat, Aug 26, 2017 at 10:28:00AM +0200, Michael Haggerty wrote:
> 
>> It turns out that the comment is incorrect, but there was nevertheless
>> plenty that could be cleaned up in the area:
>>
>> * Make macro `GIT_NIBBLE` safer by adding some parentheses
>> * Remove some dead code
>> * Fix some memory leaks
>> * Fix some obsolete and incorrect comments
>> * Reject "notes" that are not blobs
>>
>> I hope the result is also easier to understand.
>>
>> This branch is also available from my Git fork [1] as branch
>> `load-subtree-cleanup`.
> 
> FYI, Coverity seems to complain about "pu" after this series is merged, but
> I think it's wrong.  It says:
> 
>   *** CID 1417630:  Memory - illegal accesses  (OVERRUN)
>   /notes.c: 458 in load_subtree()
>   452     
>   453     			/*
>   454     			 * Pad the rest of the SHA-1 with zeros,
>   455     			 * except for the last byte, where we write
>   456     			 * the length:
>   457     			 */
>   >>>     CID 1417630:  Memory - illegal accesses  (OVERRUN)
>   >>>     Overrunning array of 20 bytes at byte offset 20 by dereferencing pointer "&object_oid.hash[len]".
>   458     			memset(object_oid.hash + len, 0, GIT_SHA1_RAWSZ - len - 1);
>   459     			object_oid.hash[KEY_INDEX] = (unsigned char)len;
>   460     
>   461     			type = PTR_TYPE_SUBTREE;
>   462     		} else {
>   463     			/* This can't be part of a note */
> 
> I agree that if "len" were 20 here that would be a problem, but I don't
> think that's possible.
> 
> The tool correctly claims that prefix_len can be up to 19, due to the
> assert:
> 
>      3. cond_at_most: Checking prefix_len >= 20UL implies that prefix_len may be up to 19 on the false branch.
>   420        if (prefix_len >= GIT_SHA1_RAWSZ)
>   421                BUG("prefix_len (%"PRIuMAX") is out of range", (uintmax_t)prefix_len);
> 
> Then it claims:
> 
>     13. Condition path_len == 2 * (20 - prefix_len), taking false branch.
>   430                if (path_len == 2 * (GIT_SHA1_RAWSZ - prefix_len)) {
>   431                        /* This is potentially the remainder of the SHA-1 */
> 
> So we know that either prefix_len is not 19, or that path_len is not 2
> (since that combination would cause us to take the true branch here).
> But then it goes on to say:
> 
>     14. Condition path_len == 2, taking true branch.
>   442                } else if (path_len == 2) {
>   443                        /* This is potentially an internal node */
> 
> which I believe must mean that prefix_len cannot be 19 here. And yet it
> says:
> 
>     15. assignment: Assigning: len = prefix_len. The value of len may now be up to 19.
>   444                        size_t len = prefix_len;
>   445
>   [...]
>      17. incr: Incrementing len. The value of len may now be up to 20.
>      18. Condition hex_to_bytes(&object_oid.hash[len++], entry.path, 1), taking false branch.
>   450                        if (hex_to_bytes(object_oid.hash + len++, entry.path, 1))
>   451                                goto handle_non_note; /* entry.path is not a SHA1 */
> 
> I think that's impossible, and Coverity simply isn't smart enough to
> shrink the set of possible values for prefix_len based on the set of
> if-else conditions.
> 
> So nothing to see here, but since I spent 20 minutes scratching my head
> (and I know others look at Coverity output and may scratch their heads
> too), I thought it was worth writing up. And also if I'm wrong, it would
> be good to know. ;)

Thanks for looking into this. I agree with your analysis.

I wonder whether it is the factor of two between path lengths and byte
lengths that is confusing Coverity. Perhaps the patch below would help.
It requires an extra, superfluous, check, but perhaps makes the code a
tad more readable. I'm neutral on whether we would want to make the change.

Is there a way to ask Coverity whether a hypothetical change would
remove the warning, short of merging the change to master?

Michael

diff --git a/notes.c b/notes.c
index 27d232f294..34f623f7b1 100644
--- a/notes.c
+++ b/notes.c
@@ -426,8 +426,14 @@ static void load_subtree(struct notes_tree *t,
struct leaf_node *subtree,
 		unsigned char type;
 		struct leaf_node *l;
 		size_t path_len = strlen(entry.path);
+		size_t path_bytes;

-		if (path_len == 2 * (GIT_SHA1_RAWSZ - prefix_len)) {
+		if (path_len % 2 != 0)
+			/* Path chunks must come in pairs of hex characters */
+			goto handle_non_note;
+
+		path_bytes = path_len / 2;
+		if (path_bytes == GIT_SHA1_RAWSZ - prefix_len) {
 			/* This is potentially the remainder of the SHA-1 */

 			if (!S_ISREG(entry.mode))
@@ -439,7 +445,7 @@ static void load_subtree(struct notes_tree *t,
struct leaf_node *subtree,
 				goto handle_non_note; /* entry.path is not a SHA1 */

 			type = PTR_TYPE_NOTE;
-		} else if (path_len == 2) {
+		} else if (path_bytes == 1) {
 			/* This is potentially an internal node */
 			size_t len = prefix_len;