Re: Problem with zombie processes

On Mon, Feb 20, 2017 at 11:54 PM, Mark Geisert wrote:
> Erik Bray wrote:
>> On Mon, Feb 20, 2017 at 11:54 AM, Mark Geisert wrote:
>>>> So my guess was that Cygwin might try to hold on to a handle to a
>>>> child process at least until it's been explicitly wait()ed.  But that
>>>> does not seem to be the case after all.
>>> You might have missed a subtlety in what I said above.  The Python
>>> interpreter itself is calling wait4() to reap your child process.  Cygwin
>>> has told Python one of its children has died.  You won't get the chance
>>> to wait() for it yourself.  Cygwin *does* have a handle to the process,
>>> but it gets closed as part of Python calling wait4().
>> To be clear, wait4() is not called from Python until the script
>> explicitly calls p.wait().
>> In other words, when I run this step by step (e.g. in gdb) I don't see a
>> wait4() call until the point where the script explicitly waits().  I
>> don't see any reason Python would do this behind the scenes.
> You're right.  I missed the wait in your script and ASSumed too much of the
> Python interpreter :-( .
>>>> Anyways, I think it would be nicer if /proc returned at least partial
>>>> information on zombie processes, rather than an error.  I have a patch
>>>> to this effect for /proc/<pid>/stat, and will add a few more as well.
>>>> To me /proc/<pid>/stat was the most important because that's the
>>>> easiest way to check the process's state in the first place!  Now I
>>>> also have to catch EINVAL as well and assume that means a zombie
>>>> process.
>>> The file /proc/<pid>/stat is there until Cygwin finishes cleanup of the
>>> child due to Python having wait()ed for it.  When you run your test
>>> script, pay attention to the process state character in those cases
>>> where you successfully read the stat file.  It's often S (stopped, I
>>> think) or R (running) but I also see Z (zombie) sometimes.  Your script
>>> is in a race with Cygwin, and you cannot guarantee you'll see a killed
>>> process's state before Cygwin cleans it up.
>>> One way around this *might* be to install a SIGCHLD handler in your
>>> Python script.  If that's possible, that should tell you when your child
>>> exits.
>> Perhaps the Python script is a red herring.  I just wrote it to
>> demonstrate the problem.  It is strange that the result depends on where
>> I send stdout, but you're likely right that it just comes down to
>> subtle timing differences.  Here's a C program that demonstrates the
>> same issue more reliably.  Interestingly, it works when I run it in
>> strace (probably just because of the strace overhead) but not when I
>> run it normally.
>> My point in all this is I'm confused why Cygwin would give up its
>> handles to the Windows process before wait() has been called.
>> (In fact, it's pretty confusing to have fopen returning EINVAL which
>> according to [1] it should only be doing if the mode string were
>> invalid.)
>> Thanks,
>> Erik
>> [1] http://pubs.opengroup.org/onlinepubs/9699919799/functions/fopen.html
> O.K., you may be on to something amiss in the Cygwin DLL.  Thanks for the
> STC in C; that'll help somebody looking further at this.  I'm out of ideas.
> It might be possible to reduce strace overhead somewhat by selecting a
> smaller set of trace options than the default.

Note: My previous test program had a bug in do_child() (not correctly
terminating the argv array).  The attached program fixes the bug.
I've also attached a (truncated) strace log from it.

#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#include <time.h>
#include <string.h>
#include <signal.h>
#include <fcntl.h>
#include <errno.h>
#include <sys/wait.h>

int do_parent(pid_t);
void do_child(int);

int main(void) {
    int devnul;
    pid_t pid;

    devnul = open("/dev/null", O_WRONLY);
    pid = fork();
    if (pid) {
        /* Parent */
        return do_parent(pid);
    } else {
        /* Child (this call is assumed; the archived listing is cut off here) */
        do_child(devnul);
    }
    return 0;
}

/* Kill the child, then try to read its /proc entry before reaping it. */
int do_parent(pid_t child_pid) {
    FILE *f;
    char buf[120];
    char filename[32];

    printf("child pid: %d\n", (int)child_pid);
    printf("sending SIGKILL\n");
    kill(child_pid, SIGKILL);
    snprintf(filename, sizeof(filename), "/proc/%d/stat", (int)child_pid);
    printf("reading %s\n", filename);
    f = fopen(filename, "r");
    if (f == NULL) {
        printf("fopen error [%d]: %s\n", errno, strerror(errno));
        if (!access(filename, R_OK)) {
            printf("but the file exists and is readable\n");
        }
    } else {
        fread(buf, sizeof(char), sizeof(buf), f);
        fclose(f);
    }
    /* The child is only reaped here, after the read attempt. */
    return wait4(child_pid, NULL, 0, NULL);
}

/* Send stdout to the given descriptor and exec a long-running program. */
void do_child(int out) {
    char *argv[2] = { "/usr/bin/yes", NULL };
    dup2(out, 1);
    execv(argv[0], argv);
}
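
The thread describes /proc/<pid>/stat as the easiest way to check a
process's state.  As a rough sketch of that check, the helper below reads
the state character and reports failure through errno, so a caller can tell
a zombie-related EINVAL (as seen on Cygwin above) from other errors.
read_proc_state() is a hypothetical name, not something from the test
program or from Cygwin.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: read the state character (R, S, Z, ...) from
 * /proc/<pid>/stat.  Returns 0 and stores the character in *state on
 * success; returns -1 with errno set if the file cannot be opened or
 * does not have the expected layout. */
int read_proc_state(pid_t pid, char *state) {
    char path[64];
    char buf[512];
    FILE *f;
    size_t n;
    char *p;

    snprintf(path, sizeof(path), "/proc/%ld/stat", (long)pid);
    f = fopen(path, "r");
    if (f == NULL) {
        return -1;  /* errno comes from fopen() */
    }
    n = fread(buf, 1, sizeof(buf) - 1, f);
    fclose(f);
    buf[n] = '\0';

    /* Layout is "pid (comm) state ...".  comm may itself contain spaces
     * or ')', so look for the last ')' instead of splitting on spaces. */
    p = strrchr(buf, ')');
    if (p == NULL || p[1] != ' ' || p[2] == '\0') {
        errno = EINVAL;
        return -1;
    }
    *state = p[2];
    return 0;
}
```

In do_parent() this could replace the bare fopen()/fread() pair: a return
of -1 with errno == EINVAL would then be the case to treat as "probably a
zombie" until /proc returns partial information for such processes.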

Attachment: test.exe.strace

