Web lists-archives.com

Re: [PATCH 1/2] refs/files-backend: duplicate strings added to affected_refnames




On Sat, Aug 26, 2017 at 12:16 PM, Martin Ågren <martin.agren@xxxxxxxxx> wrote:
> On 25 August 2017 at 23:00, Junio C Hamano <gitster@xxxxxxxxx> wrote:
>> Martin Ågren <martin.agren@xxxxxxxxx> writes:
>>
>>> files_transaction_prepare() and the functions it calls add strings to a
>>> string list without duplicating them, i.e., we keep the original raw
>>> pointers we were given. That is "ok", since we keep them only for a
>>> short-enough time, but we end up leaking some of them.
>>
>> [...]

Good find, Martin.

First of all, you are right that we don't want any memory leaks here.
Nobody promises that the program will end soon if a reference update
fails. (In fact, there are invocations of `git` that trigger multiple
reference updates.) This is a small leak, but we should fix it.

The problem (if I may take a stab at explaining it in my own words) is
that `files_transaction_prepare()` uses a `string_list` called
`affected_refnames` to ensure that the same reference name is not
modified more than once in a single reference transaction. Currently,
`affected_refnames` is configured *not* to duplicate the refnames that
are fed to it, which also means that it doesn't free the refnames when
it is cleared.

This is correct for most refnames, which are owned by entries in the
`ref_transaction`, and therefore have a longer lifetime than
`affected_refnames`.

But there is one code path that can add a refname to
`affected_refnames` without making a provision for the refname ever to
be freed:

* files_transaction_prepare()
  * lock_ref_for_update()
    * split_symref_update()
      * item = string_list_insert(affected_refnames, referent)

The `referent` in the last statement comes from a `strbuf` that is
created in `lock_ref_for_update()` then passed to `lock_raw_ref()`,
which fills it.

It would be a serious bug if `lock_ref_for_update()` would dispose of
`referent`, because the pointer in `affected_refnames` would point at
freed memory. But in fact we have the opposite problem;
`lock_ref_for_update()` never frees the memory (it doesn't even
`strbuf_detach()` it). So that memory is leaked.

Your proposed solution is to change `affected_refnames` to duplicate
the strings that are stored in it, in which case
`lock_ref_for_update()` can properly dispose of `referent`, fixing the
leak. This works, but at the price of having to allocate memory for
*all* references in the update, even though most of them are already
fine.

But note that `split_symref_update()` *also* passes a copy of
`referent` to `ref_transaction_add_update()`, which *already* makes a
copy of the reference name and adds an entry containing the name to
the `ref_transaction`. If we would store *that* copy to
`affected_refnames`, then it would get freed when the
`ref_transaction` is cleaned up, and we could fix the memory leak
without allocating any new memory. This requires a little reorg of
`split_symref_update()` but it's not too bad:

* Do the initial check using `string_list_has_string()` rather than
calling `string_list_insert()` already.
* After `new_update` has been created, call `string_list_insert()`,
passing it `new_update->refname` as the string.

If this is done in place of your first commit, then your second commit
could be left unchanged.

Michael