Web lists-archives.com

Re: [PATCH 06/34] clone: release strbuf after use in remove_junk()




Am 10.09.2017 um 19:38 schrieb Jeff King:
> On Sun, Sep 10, 2017 at 12:37:06PM +0200, René Scharfe wrote:
> 
>>> And in fact, being called from a signal handler means we should
>>> generally avoid touching malloc or free (which could be holding locks).
>>> That would mean preferring a leak to strbuf_release(). Of course that is
>>> the tip of the iceberg. We call strbuf_addstr() here, and
>>> remove_dir_recursively() will grow our buffer.
>>
>> And we call opendir(3), readdir(3), and closedir(3), which are also not
>> listed as async-safe functions by POSIX [1].
> 
> Good point. I don't know how dangerous those are in the real-world
> (i.e., is POSIX leaving an out for some implementations, or are they
> really unsafe on common platforms like Linux).

No idea.

>>> So I actually wonder if junk_git_dir and junk_work_tree should be
>>> pre-sized strbufs themselves. And that makes the leak "go away" in the
>>> eyes of leak-checkers because we hold onto the static strbufs until
>>> program exit.
>>
>> We could start with a small buffer and expand it to match the path
>> length of written files as we go.
> 
> Yes, but I didn't want to touch each code site that creates a file,
> which is a lot more invasive. I expect expanding to 4096 (or PATH_MAX)
> would be sufficient in practice.

Perhaps it is in most cases.  Having certainty would be better.  We can
leave unpack_trees() untouched and instead traverse the tree beforehand,
in order to find the longest path.  Perhaps we can do something similar
for init_db().

> I'd also note that the current code isn't _remotely_ async safe and
> nobody's complained. So I'm perfectly happy doing nothing, too. I care
> a bit more about the tempfile.c interface because it's invoked a lot
> more frequently.

I guess clones are not interrupted very often during checkout; same
with worktree creation.  So perhaps nasty things could happen with a
low probability, but nobody tried hard enough to hit that case?

>> Deletion without readdir(3) requires us to keep a list of all written
>> files and directories, though.  Perhaps in the form of an append-only
>> log; the signal handler could then go through them in reverse order
>> and remove them.  Or is there a better way?
> 
> No, I think that's how you'd have to do it. The implementation isn't all
> that bad, but hitting every file-creation would be invasive. I'd
> almost rather bail to exec-ing "rm -rf $dir", but we probably can't do
> _that_ in a signal-handler either. :)

Well, fork(2), execve(2), and waitpid(2) are on the list, so actually
you can.

mingw_spawnvpe(), which is used by start_command() on Windows,
allocates memory, though.  Preparing environment and argument list
in advance and just calling CreateProcess() in the signal handler
might work.

René