Web lists-archives.com

Re: I'm done with O_CLOEXEC

So,  you found that dup3 doesn't do what you want, and now you want to
throw out the baby with the bathwater and just say "I don't care
anymore if we leak fds" ?

On Fri, Mar 20, 2015 at 4:43 PM, Ryan Lortie <desrt@xxxxxxxx> wrote:
> karaj,
> For those unfamiliar with the issue: when a process is created on UNIX
> via naive fork() and exec(), the default is that the process will
> inherit all of the file descriptors of the parent.  This makes a lot of
> sense for stdin, stdout and stderr, but almost nothing else.
> This has been the cause of a lot of strange problems over the years.
> The typical example: a process will open a listener socket at some point
> and sometime later will call a library function that does a naive
> fork()/exec() of a helper process that hangs around past the lifetime of
> the original program.  When you try to restart the first program, the
> socket is still being held open by the helper and the new instance can't
> bind it again.
> There are two fixes to this problem.
> The first one, which we have been pursuing during the past several
> years, is to try to mark every file descriptor that we create as
> O_CLOEXEC.  This is particularly fun in multi-threaded programs because
> it means that we have a race between the creation of a file descriptor
> and marking it O_CLOEXEC vs. a fork() that may be happening in another
> thread.  This has led to the creation of a whole bunch of new syscalls
> to allow creation of file descriptors that already have O_CLOEXEC set
> from the start, thus avoiding the race.  We have tried to use these
> syscalls where possible, but they usually are not part of POSIX.
> Somethings they are completely unavailable, even in Linux, or when they
> are available, they have other annoying limitations.
> The other fix to the problem is one that we have had in place for a long
> time in the g_spawn_*() family of APIs, and also in the new GSubprocess
> API.  The trick involves close()ing all fds (except stdin/out/err) each
> time we do a fork()/exec() pair.
> Assuming it is practised universally, only one of these fixes is
> necessary.
> Today I am suggesting that we completely abandon our attempts to follow
> the first approach.  I'm done with O_CLOEXEC.
> What led me to this was the dup3() system call.  This is a variant of
> dup2() that was added (as part of the efforts mentioned above) to avoid
> the O_CLOEXEC race:
>   int dup3(int oldfd, int newfd, int flags);
> unfortunately:
>   dup3(0, -1, 0)                          = -1 EBADF (Bad file
>   descriptor)
> which means that using this as a stand-in for dup() is a no-go.  I could
> probably work around that by creating a new eventfd() or unbound UNIX
> socket in order to get a new fd number (while being careful to mark it
> as O_CLOEXEC as well) before using dup3().  We could probably also get
> this fixed in Linux, but dup3() has already been widely copied and we
> would then have to go about detecting which implementations are working
> and which aren't, and include a fallback (which would have to be
> implemented using the same dirty hacks mentioned above).  I've had
> enough with these games, and this isn't really about dup3() anyway.
> O_CLOEXEC is useless.
> Okay.  O_CLOEXEC is useful for one thing: when spawning a new process
> using fork()/exec(), you may want to know if exec() worked.  An old
> trick for this is to create a pipe and mark the writer end O_CLOEXEC.
> The reader end will read EOF (due to the close of the writer) once
> exec() has succeeded.  Otherwise, you can indicate the error by sending
> some other data through the pipe and calling exit().
> Aside from that, O_CLOEXEC is useless.
> So: starting today I'm going to stop worrying about O_CLOEXEC being set
> on every file descriptor that GLib creates.  I'm not going to go and
> retroactively tear things out where they are already working, unless it
> would provide a substantial cleanup or fixes an actual bug.  I'm not
> just going to go around looking for #ifdefs to remove.
> I believe this is justified for a few reasons:
>  - during the GSubprocess discussion, I originally held the opposite
>  opinion, but eventually became convinced (by Colin) to see the
>  inherit-by-default behaviour of exec() as nothing more than a
>  questionable implementation detail of the underlying OS.  Consequently,
>  at the high level, GSubprocess provides an API that gives the caller
>  direct control over what is inherited and what is not, and that's just
>  the way that it should be.
>  - this behaviour is not limited to GSubprocess.  Closing all fds before
>  calling exec() is a common practice in modern libraries and runtimes,
>  and for good reason.
>  - fixing the few places that we spawn other programs is massively
>  preferable to fixing the hundreds or thousands of places that we create
>  new file descriptors
>  - in the world of D-Bus activation, direct spawning of long-lived
>  helper processes is just not something that we do anymore anyway.  fds
>  are not the only thing we have to worry about here.  How about which
>  cgroup the helper ends up in?
>  - it's 2015 and I'm wondering why system() hasn't been deprecated
>  already
> Unfortunately, the drawback: it will no longer be safe, in general, to
> use GLib with broken old libraries that don't close() fds before exec().
> I'm sure others will have an opinion about this.  Did I miss something
> important in my argument?
> Cheers
> _______________________________________________
> gtk-devel-list mailing list
> gtk-devel-list@xxxxxxxxx
> https://mail.gnome.org/mailman/listinfo/gtk-devel-list
gtk-devel-list mailing list