
Re: Fork issue with timerfd




On Feb 24 19:55, Corinna Vinschen wrote:
> On Feb 24 17:27, Ken Brown wrote:
> > I'm seeing sporadic errors like this on 64-bit Cygwin when I first start emacs:
> > 
> >        0 [main] emacs-X11 864 C:\cygwin64\bin\emacs-X11.exe: *** fatal error in 
> > forked process - Can't recreate shared timerfd section during fork!
> >        0 [main] emacs 860 dofork: child 864 - died waiting for dll loading, errno 11
> > 
> > If I exit and restart, everything will be fine almost every time.
> 
> I think I see where the thinko was here.  Can you try this?

No, better try this:

diff --git a/winsup/cygwin/timerfd.cc b/winsup/cygwin/timerfd.cc
index 7e6be72b225a..a587926eed28 100644
--- a/winsup/cygwin/timerfd.cc
+++ b/winsup/cygwin/timerfd.cc
@@ -408,6 +408,7 @@ void
 timerfd_tracker::fixup_after_fork_exec (bool execing)
 {
   NTSTATUS status;
+  PVOID base_address = NULL;
   OBJECT_ATTRIBUTES attr;
   SIZE_T vsize = PAGE_SIZE;
 
@@ -416,11 +417,12 @@ timerfd_tracker::fixup_after_fork_exec (bool execing)
     return;
   /* Recreate shared section mapping */
   status = NtMapViewOfSection (tfd_shared_hdl, NtCurrentProcess (),
-			       (void **) &tfd_shared, 0, PAGE_SIZE, NULL,
-			       &vsize, ViewShare, MEM_TOP_DOWN, PAGE_READWRITE);
+			       &base_address, 0, PAGE_SIZE, NULL,
+			       &vsize, ViewShare, 0, PAGE_READWRITE);
   if (!NT_SUCCESS (status))
-    api_fatal ("Can't recreate shared timerfd section during %s!",
-	       execing ? "execve" : "fork");
+    api_fatal ("Can't recreate shared timerfd section during %s, status %y!",
+	       execing ? "execve" : "fork", status);
+  tfd_shared = (timerfd_shared *) base_address;
   /* Increment global instance count by the number of instances in this
      process */
   InterlockedAdd (&tfd_shared->instance_count, local_instance_count);

To explain:

The memory address of the shared region doesn't matter at all, so
fixup_after_fork_exec just has to re-open the shared timerfd region
anywhere.  But here's the deal per MSDN: "If the value of [BaseAddress]
is not NULL, the view is allocated starting at the specified virtual
address [...]"

But I made a mistake: I passed &tfd_shared as BaseAddress to
NtMapViewOfSection.  tfd_shared is NULL only in the parent; in the
child process it still holds the parent's value, so the view is
requested at exactly that address.  Combined with the MEM_TOP_DOWN
allocation, this may collide with mmapped regions from the parent.

So the above code now always sets BaseAddress to NULL in the call to
NtMapViewOfSection and only copies the address of the mapping over
to the tfd_shared pointer afterwards.  It also drops the MEM_TOP_DOWN
flag, which doesn't make any sense here.  And last but not least,
it prints the status code on failure for future debugging.
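
To make the in/out parameter pitfall concrete, here is a minimal,
portable sketch.  It does not call NtMapViewOfSection; mock_map_view,
mock_address_space, remap_buggy and remap_fixed are hypothetical names
invented for this illustration.  The mock only imitates the documented
rule that a non-NULL base address means "map exactly here".

```cpp
#include <cassert>
#include <cstdint>
#include <set>

// Hypothetical stand-in for a process address space: the set of page
// addresses that are already occupied.
struct mock_address_space
{
  std::set<std::uintptr_t> used;
  std::uintptr_t next_free = 0x10000;
};

// Mock of a mapping call with an in/out base address (NOT the real
// NtMapViewOfSection).  If *base is NULL, any free page is chosen and
// returned through *base.  Otherwise the view must land exactly at
// *base, and the call fails if that page is already occupied.
bool
mock_map_view (mock_address_space &as, void **base)
{
  std::uintptr_t want = reinterpret_cast<std::uintptr_t> (*base);
  if (want == 0)
    {
      while (as.used.count (as.next_free))
	as.next_free += 0x1000;
      want = as.next_free;
    }
  else if (as.used.count (want))
    return false;
  as.used.insert (want);
  *base = reinterpret_cast<void *> (want);
  return true;
}

// Buggy pattern: the pointer inherited from the parent is passed as
// the in/out argument, so its stale value acts as a fixed address.
bool
remap_buggy (mock_address_space &as, void *&shared)
{
  return mock_map_view (as, &shared);
}

// Fixed pattern: start from a NULL local, then copy the result over.
bool
remap_fixed (mock_address_space &as, void *&shared)
{
  void *base_address = nullptr;
  if (!mock_map_view (as, &base_address))
    return false;
  shared = base_address;
  return true;
}
```

If the page at the inherited address is already occupied in the child,
remap_buggy fails, while remap_fixed succeeds and updates the pointer
to wherever the mock placed the view.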


Thanks,
Corinna

-- 
Corinna Vinschen
Cygwin Maintainer
