
Re: Hangs on connect to UNIX socket being listened on in the same process (was: Cygwin hanging in pselect)




Hi Erik,

On Jan  9 14:29, Erik Bray wrote:
> On Mon, Jan 9, 2017 at 12:01 PM, Erik Bray <erik.m.bray@xxxxxxxxx> wrote:
> > On Fri, Jan 6, 2017 at 12:40 PM, Erik Bray <erik.m.bray@xxxxxxxxx> wrote:
> >> Hello, and happy new-ish year,
> >>
> >> I've been working on and off over the past few months on bringing
> >> Python's compatibility with Cygwin up to snuff, including having all
> >> pertinent tests passing.  I've noticed that there are several tests
> >> (which I currently skip) that cause the process to hang indefinitely,
> >> and not respond to any signals from Cygwin (it can only be killed from
> >> Windows).  This is Cygwin 64-bit--I have not tested 32-bit.
> >> [...]
> > I made a little bit of progress debugging this, but now I'm stumped.
> > It seems the problem is this:
> >
> > For each socket whose fd is passed to select() a thread_socket is
> > started which calls peek_socket until there are bits ready on the

Yes and no.  One thread_socket is called per 62 sockets, to account
for the maximum number of handles per WaitForMultipleObjects call.

> > socket, or until the timeout is reached.  This in turn calls
> > fhandler_socket::evaluate_events.
> > [...]
> After playing around with this a bit more I came up with a much
> simpler example.  This has nothing to do with select( ) at all,
> directly.

Right.  It has to do with how connect/accept works on AF_LOCAL sockets.
The handshake doesn't work well for situations like yours, where the
same thread tries to connect and accept on the same socket.

This already came up as a problem when porting postfix, and at the time
we added a patch to circumvent it.  Before calling connect, add this:

  setsockopt (sock_server, SOL_SOCKET, SO_PEERCRED, NULL, 0);
  setsockopt (sock_client, SOL_SOCKET, SO_PEERCRED, NULL, 0);

This is, of course, a hack.  The problem here is that server and client
of a socket are independent of each other, and there's typically no
way to know which process created the server side unless you already
are connected.  Chicken/egg.
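
To show where the two calls go, here is a minimal sketch of a
single-threaded connect/accept sequence on an AF_LOCAL socket.  The
function name self_connect and the path argument are made up for
illustration, and error handling is kept to the bare minimum.  The
return values of the setsockopt calls are deliberately ignored, on the
assumption that systems other than Cygwin reject the call harmlessly.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Connect to and accept on an AF_LOCAL socket from one thread.
   Returns 0 on success, -1 on failure.  The name and the `path'
   argument are illustrative only. */
int
self_connect (const char *path)
{
  struct sockaddr_un addr;
  int sock_server = -1, sock_client = -1, sock_accepted = -1;
  int ret = -1;

  assert (strlen (path) < sizeof addr.sun_path);
  memset (&addr, 0, sizeof addr);
  addr.sun_family = AF_LOCAL;
  strcpy (addr.sun_path, path);
  unlink (path);		/* remove a stale socket file, if any */

  sock_server = socket (AF_LOCAL, SOCK_STREAM, 0);
  sock_client = socket (AF_LOCAL, SOCK_STREAM, 0);
  if (sock_server < 0 || sock_client < 0)
    goto out;
  if (bind (sock_server, (struct sockaddr *) &addr, sizeof addr) < 0
      || listen (sock_server, 1) < 0)
    goto out;

#ifdef SO_PEERCRED
  /* The workaround: disable the secret/credential handshake on both
     ends before connecting.  Results are ignored on purpose; outside
     of Cygwin the call is expected to fail (assumption). */
  (void) setsockopt (sock_server, SOL_SOCKET, SO_PEERCRED, NULL, 0);
  (void) setsockopt (sock_client, SOL_SOCKET, SO_PEERCRED, NULL, 0);
#endif

  /* Same thread connects first, then accepts. */
  if (connect (sock_client, (struct sockaddr *) &addr, sizeof addr) < 0)
    goto out;
  sock_accepted = accept (sock_server, NULL, NULL);
  if (sock_accepted >= 0)
    ret = 0;

out:
  if (sock_accepted >= 0)
    close (sock_accepted);
  if (sock_client >= 0)
    close (sock_client);
  if (sock_server >= 0)
    close (sock_server);
  unlink (path);
  return ret;
}
```

Calling self_connect ("/tmp/example.sock") from main should return 0
rather than hang in connect.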

While replying to your mail, a thought occurred to me, though.

We might get away without the above setsockopt calls by adding a check
to connect.  It could test if the socket has already been opened by
the same process and is bound.  This could be accomplished by scanning
the file descriptor table (dtable) of the process.  If we find it,
we set the above socket option on both ends and continue without the
secret and credential check.  Credentials could be set manually since we
know user, group, and pid at this point.

It's a bit of work but might be feasible.
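
Very roughly, the lookup could work like the following sketch.  This is
not Cygwin code: struct fd_entry and find_local_listener are
hypothetical stand-ins for the real dtable and fhandler classes, and
the sketch only covers the "is this path bound by a socket in our own
process" test, not the credential setup that would follow.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one slot of a per-process descriptor
   table.  The real dtable looks nothing like this. */
struct fd_entry
{
  int in_use;			/* slot holds an open descriptor */
  int is_local_socket;		/* descriptor is an AF_LOCAL socket */
  int is_bound;			/* bind has been called on it */
  const char *bound_path;	/* path it is bound to, or NULL */
};

/* Scan the table for a bound AF_LOCAL socket whose path matches the
   connect target.  Returns the slot index, or -1 if the socket was
   not opened and bound by this process. */
int
find_local_listener (const struct fd_entry *table, size_t n,
		     const char *path)
{
  assert (path != NULL);
  for (size_t i = 0; i < n; i++)
    if (table[i].in_use && table[i].is_local_socket
	&& table[i].is_bound && table[i].bound_path
	&& strcmp (table[i].bound_path, path) == 0)
      return (int) i;
  return -1;
}
```

If the lookup returns a valid index, connect would know that both ends
live in the same process and could skip the handshake.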


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat
