Re: [PATCH] refs: make sure we never pass NULL to hashcpy

On Wed, Sep 6, 2017 at 3:26 AM, Junio C Hamano <gitster@xxxxxxxxx> wrote:
> Thomas Gummerer <t.gummerer@xxxxxxxxx> writes:
>> gcc on arch linux (version 7.1.1) warns that a NULL argument is passed
>> as the second parameter of memcpy.
>> [...]
> It is hugely annoying to see a halfway-intelligent compiler forces
> you to add such pointless asserts.
> The only way the compiler could error on this is by inferring the
> fact that new_sha1/old_sha1 could be NULL by looking at the callsite
> in ref_transaction_update() where these are used as conditionals to
> set HAVE_NEW/HAVE_OLD that are passed.  Even if the compiler were
> doing the whole-program analysis, the other two callsites of the
> function pass the address of oid.hash[] in an oid structure so it
> should know these won't be NULL.
> [...]
> I wonder if REF_HAVE_NEW/REF_HAVE_OLD are really needed in these
> codepaths, though.  Perhaps we can instead declare !!new_sha1 means
> we have the new side and rewrite the above part to
>         if (new_sha1)
>                 hashcpy(update->new_oid.hash, new_sha1);
> without an extra and totally pointless assert()?

The ultimate reason for those flags is that `struct ref_update` embeds
`new_oid` and `old_oid` directly in the struct, so there is no way to
set either of them to "NULL". (The `is_null_sha1` value is used for a
different purpose.) So those flags keep track of whether the
corresponding value is specified or absent.
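
To make that concrete, here is a minimal standalone sketch of the idea. It is not the actual code in refs.c: the `sketch_*` and `SKETCH_*` names, the flag values, and the 20-byte hash size are stand-ins chosen for illustration.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins; the real definitions live in git's refs code. */
#define SKETCH_RAWSZ 20
#define SKETCH_HAVE_NEW 0x01u
#define SKETCH_HAVE_OLD 0x02u

struct sketch_update {
	/* The hashes are embedded arrays, so they cannot "be NULL". */
	unsigned char new_hash[SKETCH_RAWSZ];
	unsigned char old_hash[SKETCH_RAWSZ];
	/* The flag bits record which of the two sides is actually set. */
	unsigned int flags;
};

/* Copy each side only when its flag says the value is present. */
static void sketch_fill(struct sketch_update *u, unsigned int flags,
			const unsigned char *new_hash,
			const unsigned char *old_hash)
{
	memset(u, 0, sizeof(*u));
	u->flags = flags;
	if (flags & SKETCH_HAVE_NEW) {
		assert(new_hash);
		memcpy(u->new_hash, new_hash, SKETCH_RAWSZ);
	}
	if (flags & SKETCH_HAVE_OLD) {
		assert(old_hash);
		memcpy(u->old_hash, old_hash, SKETCH_RAWSZ);
	}
}

/* "Is the new side specified?" is answered by the flag, not by the bytes. */
static int sketch_has_new(const struct sketch_update *u)
{
	return !!(u->flags & SKETCH_HAVE_NEW);
}
```

Note that an all-zero hash in the struct cannot stand for "absent" in this scheme, because the all-zero value is already spoken for; only the flag bit distinguishes the two cases.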

Four of the five callers of `ref_transaction_add_update()` are
constructing a new `ref_update` from an old one. They currently don't
have to look into `flags`; they just pass it on (possibly changing a
bit or two). Implementing your proposal would oblige those callers to
change from something like

> new_update = ref_transaction_add_update(
>         transaction, "HEAD",
>         update->flags | REF_LOG_ONLY | REF_NODEREF,
>         update->new_oid.hash, update->old_oid.hash,
>         update->msg);

to

> new_update = ref_transaction_add_update(
>         transaction, "HEAD",
>         update->flags | REF_LOG_ONLY | REF_NODEREF,
>         (update->flags & REF_HAVE_NEW) ? update->new_oid.hash : NULL,
>         (update->flags & REF_HAVE_OLD) ? update->old_oid.hash : NULL,
>         update->msg);

It's not the end of the world, but it's annoying.
`ref_transaction_add_update()` was meant to be a low-level,
low-overhead way of allocating a `struct ref_update` and adding it to a
transaction.

Another solution (also annoying, but maybe a tad less so) would be to
change the one iffy caller, `ref_transaction_update()`, to pass in a
pointer to the null OID for `new_sha1` and `old_sha1` when the
corresponding flags are turned off. That value would never be looked
at, but it would hopefully reassure gcc.
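
Roughly what I have in mind, as a standalone sketch rather than a patch: the `sketch_*` names, the flag values, and the simplified signatures are made up for illustration, and `sketch_null_hash` merely stands in for the null OID.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins for the real constants. */
#define SKETCH_RAWSZ 20
#define SKETCH_HAVE_NEW 0x01u
#define SKETCH_HAVE_OLD 0x02u

/* Stand-in for the null OID: an all-zero hash (static storage is zeroed). */
static const unsigned char sketch_null_hash[SKETCH_RAWSZ];

/* Low-level callee: trusts the flags and copies only the flagged sides. */
static unsigned int sketch_add_update(unsigned char *new_out,
				      unsigned char *old_out,
				      unsigned int flags,
				      const unsigned char *new_hash,
				      const unsigned char *old_hash)
{
	if (flags & SKETCH_HAVE_NEW)
		memcpy(new_out, new_hash, SKETCH_RAWSZ);
	if (flags & SKETCH_HAVE_OLD)
		memcpy(old_out, old_hash, SKETCH_RAWSZ);
	return flags;
}

/*
 * The one caller that may receive NULL: derive the flags from the
 * pointers, then substitute the placeholder so that the callee never
 * sees a NULL pointer. The placeholder is never read, because the
 * matching flag is off.
 */
static unsigned int sketch_update(unsigned char *new_out,
				  unsigned char *old_out,
				  unsigned int flags,
				  const unsigned char *new_hash,
				  const unsigned char *old_hash)
{
	flags |= (new_hash ? SKETCH_HAVE_NEW : 0) |
		 (old_hash ? SKETCH_HAVE_OLD : 0);
	return sketch_add_update(new_out, old_out, flags,
				 new_hash ? new_hash : sketch_null_hash,
				 old_hash ? old_hash : sketch_null_hash);
}
```

Whether this actually silences the warning would have to be checked with the gcc version in question.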

I did just realize one thing: `ref_transaction_update()` takes `flags`
as an argument and alters it using

>         flags |= (new_sha1 ? REF_HAVE_NEW : 0) | (old_sha1 ? REF_HAVE_OLD : 0);

Perhaps gcc is *more* intelligent than we give it credit for, and is
actually worried that the `flags` argument passed in by the caller
might *already* have one of these bits set. In that case
`ref_transaction_add_update()` would indeed be called incorrectly.
Does the warning go away if you change that line to

>         if (new_sha1)
>                 flags |= REF_HAVE_NEW;
>         else
>                 flags &= ~REF_HAVE_NEW;
>         if (old_sha1)
>                 flags |= REF_HAVE_OLD;
>         else
>                 flags &= ~REF_HAVE_OLD;

? This might be a nice change to have anyway, to isolate
`ref_transaction_update()` from mistakes by its callers. For that
matter, one might want to be even more selective about what bits are
allowed in the `flags` argument to `ref_transaction_update()`:

>         flags &= REF_ALLOWED_FLAGS; /* value would need to be determined */
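
Putting the two ideas together, a standalone sketch might look like the following. The flag values are invented, `SKETCH_NODEREF` is just an example of a caller-settable bit, and the contents of `SKETCH_ALLOWED_FLAGS` are an assumption, since the real set of allowed bits would still need to be determined.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag values; only the structure of the logic matters. */
#define SKETCH_HAVE_NEW 0x01u
#define SKETCH_HAVE_OLD 0x02u
#define SKETCH_NODEREF  0x04u

/* Assumed set of bits that a caller is allowed to pass in. */
#define SKETCH_ALLOWED_FLAGS (SKETCH_NODEREF)

/*
 * Sanitize caller-supplied flags: drop every bit the caller may not
 * set, then make the HAVE_* bits depend only on the pointers.
 */
static unsigned int sketch_normalize_flags(unsigned int flags,
					   const void *new_hash,
					   const void *old_hash)
{
	flags &= SKETCH_ALLOWED_FLAGS;

	if (new_hash)
		flags |= SKETCH_HAVE_NEW;
	else
		flags &= ~SKETCH_HAVE_NEW;
	if (old_hash)
		flags |= SKETCH_HAVE_OLD;
	else
		flags &= ~SKETCH_HAVE_OLD;

	return flags;
}
```

With a mask that excludes the HAVE_* bits, the two `else` branches become redundant; they are kept here so that the function stays correct even if the mask is later widened.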