Date: Sat, 27 Aug 2011 22:37:06 +0200
From: Corinna Vinschen
To: cygwin AT cygwin DOT com
Subject: Re: STC for libapr1 failure
Message-ID: <20110827203706.GA15411@calimero.vinschen.de>
In-Reply-To: <20110826111509.GH10490@calimero.vinschen.de>
References: <4E56EB24 DOT 5000505 AT acm DOT org> <20110826111509 DOT GH10490 AT calimero DOT vinschen DOT de>

On Aug 26 13:15, Corinna Vinschen wrote:
> On Aug 25 17:39, David Rothenberger wrote:
> > For a while now, the test cases that come with libapr1 have been
> > bombing with this message:
> >
> >   *** fatal error - NtCreateEvent(lock): 0xC0000035
> >
> > I finally took some time to investigate and have extracted a STC
> > that demonstrates the problem.
>
> Thanks a lot for the testcase.  In theory, the NtCreateEvent call should
> not have happened at all, since it's called under lock, and the code
> around that should have made sure that the object doesn't exist at the
> time.
>
> After a few hours of extreme puzzlement, I now finally know what happens.
> It's kinda hard to explain.
>
> A lock on a file is represented by an event object.  Process A holds the
> lock corresponding with event a.  Process B tries to lock, but the lock
> of process A blocks that.  So B now waits for event a, until it gets
> signalled.  Now A unlocks, thus signalling event a and closing the handle
> afterwards.  But A's time slice isn't up yet, so it tries again to lock
> the file, before B has returned from the wait for a.  And here a wrong
> condition fails to recognize the situation.  It finds the event object,
> but since it's recognized as "that's me", it doesn't treat the event as
> a blocking factor.  This in turn allows A to create its own lock
> event object.  However, the object still exists, since B still has an
> open handle to it.  So creating the event fails, and rightfully so.
>
> What I don't have is an idea how to fix this problem correctly.  I have
> to think about that.  Stay tuned.

Please test the latest snapshot.  It should fix this problem, as well as
a starvation problem with signals (and, fwiw, thread cancel events) in
flock, lockf, and POSIX fcntl locks.

Thanks again for the testcase.  It was very helpful in testing both problems.


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat
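
For readers who want to step through the sequence described in the quoted
explanation, here is a minimal sketch of the race as a toy model.  It is not
Cygwin code and makes no real NT calls: the ObjectNamespace class, the event
name "lock-event-a", and the replay_race function are all hypothetical names
invented for illustration.  The only value taken from the report is the
status 0xC0000035, which is the NTSTATUS code STATUS_OBJECT_NAME_COLLISION.
The model assumes a named object lives as long as at least one handle to it
is open, and that creating an object under a name that is still in use fails.

```python
STATUS_SUCCESS = 0x00000000
STATUS_OBJECT_NAME_COLLISION = 0xC0000035  # status seen in the fatal error


class ObjectNamespace:
    """Toy stand-in for a named kernel-object namespace (hypothetical)."""

    def __init__(self):
        self.handle_counts = {}  # event name -> number of open handles

    def create_event(self, name):
        # Assumption: creating a name that still has open handles collides.
        if self.handle_counts.get(name, 0) > 0:
            return STATUS_OBJECT_NAME_COLLISION
        self.handle_counts[name] = 1
        return STATUS_SUCCESS

    def open_event(self, name):
        # A waiter takes its own handle to the existing event.
        self.handle_counts[name] += 1

    def close_handle(self, name):
        self.handle_counts[name] -= 1
        if self.handle_counts[name] == 0:
            # The object disappears only when its last handle is closed.
            del self.handle_counts[name]


def replay_race():
    """Replay the interleaving from the mail; return (racy, late) statuses."""
    ns = ObjectNamespace()
    name = "lock-event-a"  # hypothetical event name

    ns.create_event(name)   # A locks the file: event a is created
    ns.open_event(name)     # B is blocked by A's lock and waits on event a
    ns.close_handle(name)   # A unlocks: signals a, then closes its handle

    # A relocks before B has returned from its wait.  B's handle keeps
    # event a alive, so creating it again collides.
    racy = ns.create_event(name)

    ns.close_handle(name)   # B finally wakes up and closes its handle
    late = ns.create_event(name)  # now the name is free again
    return racy, late
```

Calling replay_race() returns the collision status for the early relock and
success for the relock that happens after B has closed its handle, which is
the ordering dependence the explanation describes.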