Web lists-archives.com

Re: [PATCH 06/34] clone: release strbuf after use in remove_junk()




Am 10.09.2017 um 09:30 schrieb Jeff King:
> On Sun, Sep 10, 2017 at 08:27:40AM +0200, René Scharfe wrote:
> 
>>>>    	if (junk_work_tree) {
>>>>    		strbuf_addstr(&sb, junk_work_tree);
>>>>    		remove_dir_recursively(&sb, 0);
>>>> -		strbuf_reset(&sb);
>>>>    	}
>>>> +	strbuf_release(&sb);
>>>>    }
>>>
>>> The code definitely needs a _release() at the end, but I feel
>>> lukewarm about the "if we are about to _release(), do not bother to
>>> _reset()" micro-optimization.  Keeping the existing two users that
>>> use sb as a (shared and reused) temporary similar would help those
>>> who add the third one or reuse the pattern in their code elsewhere.
>>
>> That's not intended as an optimization, but as a promotion -- the reset
>> is moved to the outer block and upgraded to a release.  The result is
>> consistent with builtin/worktree.c::remove_junk().
> 
> Hmm. This is a cleanup function called only from signal and atexit
> handlers. I don't think we actually do need to clean up, and this might
> be a good candidate for UNLEAK().
> 
> And in fact, being called from a signal handler means we should
> generally avoid touching malloc or free (which could be holding locks).
> That would mean preferring a leak to strbuf_release(). Of course that is
> the tip of the iceberg. We call strbuf_addstr() here, and
> remove_dir_recursively() will grow our buffer.

And we call opendir(3), readdir(3), and closedir(3), which are also not
listed as async-safe functions by POSIX [1].

> So I actually wonder if junk_git_dir and junk_work_tree should be
> pre-sized strbufs themselves. And that makes the leak "go away" in the
> eyes of leak-checkers because we hold onto the static strbufs until
> program exit.

We could start with a small buffer and expand it to match the path
length of written files as we go.

Deletion without readdir(3) requires us to keep a list of all written
files and directories, though.  Perhaps in the form of an append-only
log; the signal handler could then go through them in reverse order
and remove them.  Or is there a better way?

René


[1] http://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_04_03_03