Cygwin hanging in pselect




Hello, and happy new-ish year,

I've been working on and off over the past few months on bringing
Python's compatibility with Cygwin up to snuff, including having all
pertinent tests passing.  I've noticed that there are several tests
(which I currently skip) that cause the process to hang indefinitely,
and not respond to any signals from Cygwin (it can only be killed from
Windows).  This is Cygwin 64-bit--I have not tested 32-bit.

I finally looked into this problem and found the lockup to be in
pselect() somewhere.  Attached I've provided the most minimal example
I've been able to come up with so far that reproduces the problem,
which I'll describe in a bit more detail next.  I can provide cygcheck
output on request, but I was also able to reproduce this on a recent
build from source.

So far as I've been able to tell, the problem only occurs with AF_UNIX
sockets.  In the example I have a 'server' socket and a 'client'
socket both set to non-blocking.  The client connects to the socket,
returning errno EINPROGRESS as expected.  Then I do a pselect on the
client socket to wait until it is ready to be read from.  The hang
only happens when I pselect on the client socket, and not on the
server socket.  It doesn't seem to make a difference what the timeout
is.  One thing I have not tried is whether the hang also occurs when
the client and server are in different processes; the Python test this
example reproduces has them both in the same process.
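
The untested two-process case could be set up roughly as follows. This is only a sketch: split_process_test and its arguments are hypothetical names that do not appear in the attached test, it has not been run on Cygwin, and it is not known whether it hangs there. The parent binds and listens; a forked child performs the non-blocking connect and the pselect on its read set.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper (not part of the attached test): listen on `path` in
 * the calling process, then fork a child that does the non-blocking
 * connect() and the pselect() on its read set.  Returns the child's exit
 * status (0 = pselect returned without error), or -1 on a setup failure. */
static int split_process_test(const char *path, long timeout_sec) {
    struct sockaddr_un addr;
    int server, status;
    pid_t pid;

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strncpy(addr.sun_path, path, sizeof(addr.sun_path) - 1);

    server = socket(AF_UNIX, SOCK_STREAM, 0);
    if (server < 0)
        return -1;
    unlink(path);  /* remove a stale socket file from an earlier run */
    if (bind(server, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(server, 5) < 0) {
        close(server);
        return -1;
    }

    pid = fork();
    if (pid < 0) {
        close(server);
        unlink(path);
        return -1;
    }
    if (pid == 0) {
        /* Child: same client-side steps as the single-process test. */
        fd_set rfds;
        struct timespec ts;
        int client = socket(AF_UNIX, SOCK_STREAM, 0);
        if (client < 0)
            _exit(2);
        fcntl(client, F_SETFL, fcntl(client, F_GETFL, 0) | O_NONBLOCK);
        /* Result ignored on purpose: EINPROGRESS is the expected outcome. */
        connect(client, (struct sockaddr *)&addr, sizeof(addr));
        FD_ZERO(&rfds);
        FD_SET(client, &rfds);
        ts.tv_sec = timeout_sec;
        ts.tv_nsec = 0;
        _exit(pselect(client + 1, &rfds, NULL, NULL, &ts, NULL) < 0 ? 1 : 0);
    }

    /* Parent: only waits for the child; it never calls accept(). */
    if (waitpid(pid, &status, 0) < 0)
        status = -1;
    else
        status = WIFEXITED(status) ? WEXITSTATUS(status) : -1;
    close(server);
    unlink(path);
    return status;
}
```

A call such as split_process_test("@test.sock", 1) mirrors the path and timeout of the single-process test. If the child locks up the same way, the parent's waitpid() will block as well, so the whole thing would still have to be killed from Windows.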

Below is (I think) the most relevant output from strace on the test
case.  It seems to hang somewhere in socket_cleanup, but I haven't
investigated any further than that.

Thanks,
Erik

  261   14732 [main] poll_test 79200 cygwin_socket: socket (1, 1 (flags 0x0), 0)
--- Process 79200 loaded C:\Windows\System32\ws2_32.dll at 00007FFF8D5D0000
  985   15717 [main] poll_test 79200 wsock_init: res 0
   21   15738 [main] poll_test 79200 wsock_init: wVersion 514
   15   15753 [main] poll_test 79200 wsock_init: wHighVersion 514
   12   15765 [main] poll_test 79200 wsock_init: szDescription WinSock 2.0
   13   15778 [main] poll_test 79200 wsock_init: szSystemStatus Running
   17   15795 [main] poll_test 79200 wsock_init: iMaxSockets 0
   16   15811 [main] poll_test 79200 wsock_init: iMaxUdpDg 0
--- Process 79200 loaded C:\Windows\System32\mswsock.dll at 00007FFF89540000
  557   16368 [main] poll_test 79200 build_fh_pc: fh 0x18030BE70, dev 001E0079
   87   16455 [main] poll_test 79200 fhandler_base::set_flags: flags
0x10002, supplied_bin 0x0
   18   16473 [main] poll_test 79200 fhandler_base::set_flags:
O_TEXT/O_BINARY set in flags 0x10000
   15   16488 [main] poll_test 79200 fhandler_base::set_flags:
filemode set to binary
   15   16503 [main] poll_test 79200 fdsock: fd 3, name '', soc 0x180
   21   16524 [main] poll_test 79200 cygwin_socket: 3 = socket(1, 1
(flags 0x0), 0)
   15   16539 [main] poll_test 79200 fcntl64: fcntl(3, 3, ...)
   17   16556 [main] poll_test 79200 fhandler_base::fcntl: GETFL: 0x10002
   15   16571 [main] poll_test 79200 fcntl64: 65538 = fcntl(3, 3, 0x0)
   16   16587 [main] poll_test 79200 fcntl64: fcntl(3, 4, ...)
   22   16609 [main] poll_test 79200 fhandler_socket::ioctl: socket is
now nonblocking
   22   16631 [main] poll_test 79200 fhandler_socket::ioctl: 0 =
ioctl_socket(8004667E, 0xFFFFC9FC)
   17   16648 [main] poll_test 79200 fhandler_base::set_flags: flags
0x14002, supplied_bin 0x0
   18   16666 [main] poll_test 79200 fhandler_base::set_flags:
O_TEXT/O_BINARY set in flags 0x10000
   15   16681 [main] poll_test 79200 fhandler_base::set_flags:
filemode set to binary
   15   16696 [main] poll_test 79200 fcntl64: 0 = fcntl(3, 4, 0x14002)
   21   16717 [main] poll_test 79200 normalize_posix_path: src @test.sock

   ... snip path checking stuff...

   19   17118 [main] poll_test 79200 path_conv::check:
this->path(C:\Users\Erik M. Bray\src\python\cpython\@test.sock),
has_acls(1)
   70   17188 [main] poll_test 79200 fhandler_socket::bind: AF_LOCAL:
socket bound to port 55085
  298   17486 [main] poll_test 79200 set_posix_access: ACL-Size: 100
   37   17523 [main] poll_test 79200 set_posix_access: Created SD-Size: 176
--- Process 79200 loaded C:\Windows\System32\cryptbase.dll at 00007FFF898B0000
--- Process 79200 loaded C:\Windows\System32\bcryptprimitives.dll at
00007FFF8AE30000
 3492   21015 [main] poll_test 79200 cygwin_bind: 0 = bind(3, 0xFFFFCB10, 110)
  112   21127 [main] poll_test 79200 getpid: 79200 = getpid()
   27   21154 [main] poll_test 79200 cygwin_listen: 0 = listen(3, 5)
   21   21175 [main] poll_test 79200 cygwin_socket: socket (1, 1 (flags 0x0), 0)
   68   21243 [main] poll_test 79200 build_fh_pc: fh 0x18030C310, dev 001E0079
   44   21287 [main] poll_test 79200 fhandler_base::set_flags: flags
0x10002, supplied_bin 0x0
   15   21302 [main] poll_test 79200 fhandler_base::set_flags:
O_TEXT/O_BINARY set in flags 0x10000
   13   21315 [main] poll_test 79200 fhandler_base::set_flags:
filemode set to binary
   13   21328 [main] poll_test 79200 fdsock: fd 4, name '', soc 0x188
   20   21348 [main] poll_test 79200 cygwin_socket: 4 = socket(1, 1
(flags 0x0), 0)
   16   21364 [main] poll_test 79200 fcntl64: fcntl(4, 3, ...)
   15   21379 [main] poll_test 79200 fhandler_base::fcntl: GETFL: 0x10002
   13   21392 [main] poll_test 79200 fcntl64: 65538 = fcntl(4, 3, 0x0)
   17   21409 [main] poll_test 79200 fcntl64: fcntl(4, 4, ...)
   14   21423 [main] poll_test 79200 fhandler_socket::ioctl: socket is
now nonblocking
   14   21437 [main] poll_test 79200 fhandler_socket::ioctl: 0 =
ioctl_socket(8004667E, 0xFFFFC9FC)
   13   21450 [main] poll_test 79200 fhandler_base::set_flags: flags
0x14002, supplied_bin 0x0
   13   21463 [main] poll_test 79200 fhandler_base::set_flags:
O_TEXT/O_BINARY set in flags 0x10000
   13   21476 [main] poll_test 79200 fhandler_base::set_flags:
filemode set to binary
   12   21488 [main] poll_test 79200 fcntl64: 0 = fcntl(4, 4, 0x14002)
   20   21508 [main] poll_test 79200 normalize_posix_path: src @test.sock

...

   76   21922 [main] poll_test 79200 getpid: 79200 = getpid()
--- Process 79200 thread 18528 created
  350   22272 [main] poll_test 79200 __set_errno: void
__set_winsock_errno(const char*, int):224 setting errno 119
   26   22298 [main] poll_test 79200 __set_winsock_errno: connect:1232
- winsock error 10036 -> errno 119
   17   22315 [main] poll_test 79200 cygwin_connect: -1 = connect(4,
0xFFFFCB10, 110), errno 119
  100   22415 [main] poll_test 79200 time: 1483702462 = time(0x0)
  731   23146 [main] poll_test 79200
pwdgrp::fetch_account_from_windows: line:
<Administrators:S-1-5-32-544:544:>
  108   23254 [main] poll_test 79200 stat64: entering
   32   23286 [main] poll_test 79200 normalize_posix_path: src /dev
   30   23316 [main] poll_test 79200 normalize_posix_path: /dev =
normalize_posix_path (/dev)
   26   23342 [main] poll_test 79200 mount_info::conv_to_win32_path:
conv_to_win32_path (/dev)
   27   23369 [main] poll_test 79200 set_flags: flags: binary (0x2)
   23   23392 [main] poll_test 79200 mount_info::conv_to_win32_path:
src_path /dev, dst C:\cygwin64\dev, flags 0x3000A, rc 0
   54   23446 [main] poll_test 79200 symlink_info::check: 0x0 =
NtCreateFile (\??\C:\cygwin64\dev)
   37   23483 [main] poll_test 79200 symlink_info::check: not a symlink
   29   23512 [main] poll_test 79200 symlink_info::check: 0 =
symlink.check(C:\cygwin64\dev, 0xFFFFB250) (0x43000A)
   39   23551 [main] poll_test 79200 build_fh_pc: fh 0x18030C700, dev 000000C1
   30   23581 [main] poll_test 79200 stat_worker:
(\??\C:\cygwin64\dev, 0x1802E2A20, 0x18030C700), file_attributes 16
  194   23775 [main] poll_test 79200 fhandler_base::fstat_helper: 0 =
fstat (\??\C:\cygwin64\dev, 0x1802E2A20) st_size=0, st_mode=040755,
st_ino=562949953536516st_atim=57696141.1B43B4B0
st_ctim=57696141.1B43B4B0 st_mtim=57696141.1B43B4B0
st_birthtim=5769612C.1F3F08BC
   44   23819 [main] poll_test 79200 stat_worker: 0 =
(\??\C:\cygwin64\dev,0x1802E2A20)
   38   23857 [main] poll_test 79200 fstat64: 0 = fstat(1, 0xFFFFC580)
   48   23905 [main] poll_test 79200 isatty: 1 = isatty(1)
  154   24059 [main] poll_test 79200 fhandler_pty_slave::write: pty9,
write(0x6000426B0, 40)
   51   24110 [main] poll_test 79200
fhandler_pty_common::process_opost_output: (1901): pty output_mutex
(0x140): waiting -1 ms
   47   24157 [main] poll_test 79200
fhandler_pty_common::process_opost_output: (1901): pty output_mutex:
acquired
   23   24180 [main] poll_test 79200
fhandler_pty_common::process_opost_output: (1940): pty
output_mutex(0x140) released
Ret from client connect: -1; errno: 119
   23   24203 [main] poll_test 79200 write: 40 = write(1, 0x6000426B0, 40)
   99   24302 [main] poll_test 79200 pselect: pselect (5, 0xFFFFCB90,
0x0, 0x0, 0xFFFFCB80, 0x0)
   20   24322 [main] poll_test 79200 pselect: to->tv_sec 1,
to->tv_nsec 0, us 1000000
   41   24363 [main] poll_test 79200 dtable::select_read:  fd 4
   33   24396 [main] poll_test 79200 select: sel.always_ready 0
   97   24493 [main] poll_test 79200 start_thread_socket: stuff_start 0xFFFFC8E8
--- Process 79200 thread 46528 created
  172   24665 [socksel] poll_test 79200 cygthread::stub: thread
'socksel', id 0xB5C0, stack_ptr 0x12ECCD0
   24   24689 [socksel] poll_test 79200 thread_socket: stuff_start
0xFFFFC8E8, timeout 4294967295
   23   24712 [main] poll_test 79200 select_stuff::wait: m 4, us
1000000, wmfo_timeout -1
   31   24743 [socksel] poll_test 79200
fhandler_socket::af_local_connect: af_local_connect called,
no_getpeereid=0
  106   24849 [socksel] poll_test 79200
fhandler_socket::af_local_send_secret: Sending af_local secret
succeeded
999984 1024833 [main] poll_test 79200 select_stuff::wait: wait_ret 3,
m = 4.  verifying
   65 1024898 [main] poll_test 79200 select_stuff::wait: timed out
   38 1024936 [main] poll_test 79200 select_stuff::wait: returning 1
   31 1024967 [main] poll_test 79200 select: sel.wait returns 1
   18 1024985 [main] poll_test 79200 select_stuff::cleanup: calling
cleanup routines
   16 1025001 [main] poll_test 79200 socket_cleanup: si 0x6000526C0
si->thread 0x1801FE758

/* Attached test case: non-blocking AF_UNIX client and server in one process. */
#include <arpa/inet.h>
#include <sys/socket.h>
#include <sys/select.h>
#include <string.h>
#include <stdio.h>
#include <fcntl.h>
#include <sys/time.h>
#include <sys/un.h>
#include <errno.h>


/* Add O_NONBLOCK to the socket's existing file status flags. */
#define SET_NONBLOCKING(sock) \
    fcntl(sock, F_SETFL, fcntl(sock, F_GETFL, 0) | O_NONBLOCK)


int main(void) {
    fd_set rfds;
    struct timespec tv;
    int sock_server, sock_client;
    int retval;
    struct sockaddr_un addr;

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strcpy(addr.sun_path, "@test.sock");

    /* Server side: bind and listen, but never accept. */
    sock_server = socket(AF_UNIX, SOCK_STREAM, 0);
    SET_NONBLOCKING(sock_server);
    if (bind(sock_server, (struct sockaddr*)&addr, sizeof(addr))) {
        printf("binding server socket failed\n");
        return 1;
    }
    listen(sock_server, 5);

    /* Client side: the non-blocking connect reports EINPROGRESS. */
    sock_client = socket(AF_UNIX, SOCK_STREAM, 0);
    SET_NONBLOCKING(sock_client);
    retval = connect(sock_client, (struct sockaddr*)&addr, sizeof(addr));
    printf("Ret from client connect: %d; errno: %d\n", retval, errno);

    FD_ZERO(&rfds);
    FD_SET(sock_client, &rfds);
    tv.tv_sec = 1;
    tv.tv_nsec = 0;

    /* The process hangs here, in the cleanup after the select times out. */
    retval = pselect(sock_client + 1, &rfds, NULL, NULL, &tv, NULL);

    return 0;
}
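
For comparison with the read-set pselect in the test above: POSIX describes waiting for a non-blocking connect() to complete by selecting the socket for writing and then reading SO_ERROR. A minimal sketch of that pattern follows. wait_for_connect is a hypothetical helper name, not something from the attached test, and whether this variant also triggers the hang on Cygwin is untested.

```c
#include <errno.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <time.h>

/* Hypothetical helper: wait up to timeout_sec seconds for a non-blocking
 * connect() on fd to finish.  Returns 0 once the connection is established,
 * or -1 with errno set (ETIMEDOUT if the wait expired, otherwise the
 * pending socket error or the pselect/getsockopt failure). */
static int wait_for_connect(int fd, long timeout_sec) {
    fd_set wfds;
    struct timespec ts;
    int err = 0;
    socklen_t len = sizeof(err);
    int n;

    FD_ZERO(&wfds);
    FD_SET(fd, &wfds);
    ts.tv_sec = timeout_sec;
    ts.tv_nsec = 0;

    /* A socket becomes writable when the connect attempt has finished,
     * whether it succeeded or failed. */
    n = pselect(fd + 1, NULL, &wfds, NULL, &ts, NULL);
    if (n < 0)
        return -1;
    if (n == 0) {
        errno = ETIMEDOUT;
        return -1;
    }

    /* SO_ERROR distinguishes a completed connection from a failed one. */
    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) < 0)
        return -1;
    if (err != 0) {
        errno = err;
        return -1;
    }
    return 0;
}
```

In the test case this would replace the final pselect call, e.g. wait_for_connect(sock_client, 1) after connect() has returned -1 with EINPROGRESS. Since the server never calls accept(), the result may still differ between platforms.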
--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple