Date: Wed, 22 Jul 2009 12:08:04 +0200
From: Corinna Vinschen
To: cygwin AT cygwin DOT com
Subject: Re: flock still buggy
Message-ID: <20090722100804.GF27613@calimero.vinschen.de>
Reply-To: cygwin AT cygwin DOT com
Mail-Followup-To: cygwin AT cygwin DOT com

On Jul 21 23:26, Eric Blake wrote:
> I finally figured out why autoconf is still failing its flock-related tests,
> and why perl was reliably failing even though my simple attempts in C were
> always passing.  It turns out that if you do:
>
> open
> flock(LOCK_EX)
> if (!fork)
>   execlp("sleep","sleep","10",NULL);
> sleep(10);
>
> then ProcessExplorer shows that the Event in the global namespace of flock-dev-
> ino\20-2-* exists in both parent and child, and with a notification level of
> false, blocking any outside influence until both the parent and forkee exit.
> But if you do:
>
> open
> fcntl (fd, F_SETFD, FD_CLOEXEC | fcntl (fd, F_GETFD))
> flock(LOCK_EX)
> if (!fork)
>   execlp("sleep","sleep","10",NULL);
> sleep(10);
>
> then only the parent holds a handle to the Event, but with a notification level
> of true, allowing any outside party to do whatever they want.

Do you have a working C testcase to demonstrate this?

> I'm still trying to figure out why the close-on-exec cleanup appears to be
> spuriously triggering the flock Event to unlock.  But my understanding is that
> F_FLOCK locks should survive over exec, so the close-on-exec cleanup should
> only trigger lock release on F_POSIX locks.

I have a hunch it's a thinko in fhandler_base::del_my_locks.  In case of
close_on_exec, the underlying file handle is already invalid.

Here's a question:  If you strace this, do you get a debug message from
get_obj_handle_count with a status code from NtQueryObject?

Does the below patch fix the problem?  Does the reasoning sound...
reasonable?

Index: flock.cc
===================================================================
RCS file: /cvs/src/src/winsup/cygwin/flock.cc,v
retrieving revision 1.23
diff -u -p -r1.23 flock.cc
--- flock.cc	14 Jul 2009 17:37:42 -0000	1.23
+++ flock.cc	22 Jul 2009 10:07:17 -0000
@@ -350,8 +350,18 @@ fhandler_base::del_my_locks (bool after_
   inode_t *node = inode_t::get (get_dev (), get_ino (), false);
   if (node)
     {
+      /* In the close-on-exec case, our io handle is already invalid.
+         We can't use it to test for the object reference count.
+         However, that shouldn't be necessary for the following reason.
+         After exec, there are no threads in the current process
+         waiting for the lock.  So, either we're the only process
+         accessing the file table entry and there are no threads
+         which require signalling, or we have a parent process still
+         accessing the file object and signalling the lock event would
+         be premature. */
       bool no_locks_left =
-        node->del_my_locks (after_fork ? 0 : get_unique_id (), get_handle ());
+        node->del_my_locks (after_fork ? 0 : get_unique_id (),
+                            close_on_exec () ? NULL : get_handle ());
       if (no_locks_left)
         {
           LIST_REMOVE (node, i_next);


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat
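
For anyone wanting to reproduce the behaviour discussed in this thread, the second quoted sequence (close-on-exec set before flock, then fork and exec) could be fleshed out into a C testcase roughly as follows.  This is only a sketch and not the testcase from the thread: the function names lock_survives_child_exec and lock_is_held are made up for illustration, the child runs "sleep 1" instead of "sleep 10" to keep the run short, and the outside party is modelled by a second forked process that opens the file afresh and tries a non-blocking exclusive lock after the exec'ed child has exited.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork an "outside party" that opens PATH afresh and
   tries a non-blocking exclusive flock.  Returns 1 if the lock is held by
   someone else, 0 if the outsider could take it, -1 on error.  */
static int
lock_is_held (const char *path)
{
  int status;
  pid_t pid = fork ();

  if (pid < 0)
    return -1;
  if (pid == 0)
    {
      int fd = open (path, O_RDWR);
      if (fd < 0)
        _exit (2);
      if (flock (fd, LOCK_EX | LOCK_NB) == 0)
        _exit (0);                      /* lock was free */
      _exit (errno == EWOULDBLOCK ? 1 : 2);
    }
  if (waitpid (pid, &status, 0) < 0 || !WIFEXITED (status))
    return -1;
  return WEXITSTATUS (status) == 2 ? -1 : WEXITSTATUS (status);
}

/* Hypothetical testcase: take an flock lock, fork a child that execs,
   wait for it, then check whether the parent's lock is still in place.
   Returns 1 if the lock survived, 0 if it was lost, -1 on error.  */
int
lock_survives_child_exec (const char *path, int use_cloexec)
{
  int fd, flags, held, status;
  pid_t pid;

  assert (path != NULL);
  fd = open (path, O_RDWR | O_CREAT, 0600);
  if (fd < 0)
    return -1;
  if (use_cloexec)
    {
      flags = fcntl (fd, F_GETFD);
      if (flags < 0 || fcntl (fd, F_SETFD, flags | FD_CLOEXEC) < 0)
        goto fail;
    }
  if (flock (fd, LOCK_EX) < 0)
    goto fail;

  pid = fork ();
  if (pid < 0)
    goto fail;
  if (pid == 0)
    {
      execlp ("sleep", "sleep", "1", (char *) NULL);
      _exit (127);                      /* exec itself failed */
    }
  /* Once the child has exited, its exec (and any close-on-exec cleanup)
     has certainly happened.  */
  if (waitpid (pid, &status, 0) < 0
      || (WIFEXITED (status) && WEXITSTATUS (status) == 127))
    goto fail;

  held = lock_is_held (path);           /* parent still has fd open here */
  close (fd);
  return held;

fail:
  close (fd);
  return -1;
}
```

Calling lock_survives_child_exec (path, 0) and lock_survives_child_exec (path, 1) from a small main covers both quoted sequences.  Where flock locks belong to the open file description, both calls should return 1; a return of 0 in the close-on-exec case would correspond to the spurious unlock described above.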