Subject: RE: 1.1.3 and upwards: apparent bug with pthread_cond_wait() and/or signal()
Date: Thu, 2 May 2002 01:37:34 +1000
From: "Robert Collins"
To: "Michael Beach"
Mail-Followup-To: cygwin AT cygwin DOT com

> -----Original Message-----
> From: Michael Beach [mailto:michaelb AT ieee DOT org]
> Sent: Thursday, May 02, 2002 12:21 AM

> Thanks for taking the time to look at this issue, but I must disagree that
> this is the problem.

You're going to have to debug this yourself. I've given you my opinion :].

> If the test thread locks the mutex first, sure it will probably signal before
> the main thread is waiting, but that doesn't matter because the main thread

Does this sequence look plausible to you? I don't claim it is what's happening, because the string output doesn't fit, but it illustrates the race. On a dual-processor machine this is much more likely than on a single-processor one.

thread - lock
thread - state=run
thread - signal
main - lock
main - test state (passes)
thread - test state (fails)
main - state = acknowledged
main - signal
thread - wait
main - unlock
main - join

thread is hung.

What we are seeing:

main - lock
main - test state fails
main - wait
thread - lock
thread - state=run
thread - signal
-- test thread has signal()ed
thread - test state (fails)
-- test thread about to wait()...
thread - wait
-- main thread wakes!
main - state = acknowledged
-- main thread about to signal()
main - signal
main - unlock
-- main thread waiting for exit...

thread should wake here.

> If the above hand-wavy explanation does not seem convincing,
...
> the different platforms does not seem to hold much water...

Without a few more output statements, I'll not buy into that. However, I do accept your hand waving, particularly since I've noticed something useful out of this: pthread_join's argument should not be 0. I have to dig up the spec to confirm this, though... but our code will segfault like crazy on you as it stands.

> However, that said, I will be trying 1.3.10 to see if it makes a difference.
> If not, then I guess I will just have to make the move to the Win32 threading
> and synchronization APIs. Blech!

You could always help us debug the pthreads code... I wonder if the recent patches I haven't reviewed properly yet address this. If you had time, you could try them and see...

> > You should also _always_ test for the return value when using pthreads
> > calls. They don't throw exceptions and they don't set errno, so the
> > only way you can tell an error has occurred is to record the return
> > value.
>
> Yes I know. The reason for this sloppy coding is that this test program is
> ...

Please don't remove error handling. If I were to run this program I'd expect to have error handling so I don't have to add it in. And running the code without error handling won't help me identify anything non-trivial.

Rob (Cygwin pthreads maintainer).
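
For illustration only, here is a minimal sketch of a two-way handshake that survives either ordering in the traces above, with every pthreads return value checked. It is not the test program under discussion (that code is not reproduced in this message). The names `CHECK`, `worker`, `run_handshake` and the `STATE_*` values are hypothetical, and sharing a single condition variable between both directions is an assumption made to keep the sketch short. The point is that each side waits on the state in a loop rather than on the signal itself, so a signal sent before the other side is waiting is not lost.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical handshake states; not taken from the original test program. */
enum { STATE_INIT, STATE_RUN, STATE_ACK };

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
static int state = STATE_INIT;

/* pthreads calls return the error number directly instead of setting errno,
 * so the return value is the only place to detect a failure. */
#define CHECK(call)                                            \
    do {                                                       \
        int rc_ = (call);                                      \
        if (rc_ != 0) {                                        \
            fprintf(stderr, "%s: %s\n", #call, strerror(rc_)); \
            exit(EXIT_FAILURE);                                \
        }                                                      \
    } while (0)

static void *worker(void *arg)
{
    (void)arg;
    CHECK(pthread_mutex_lock(&lock));
    state = STATE_RUN;
    CHECK(pthread_cond_signal(&cond));
    /* Wait on the predicate, not on the signal: if main has already
     * acknowledged, the loop body never runs and nothing is missed. */
    while (state != STATE_ACK)
        CHECK(pthread_cond_wait(&cond, &lock));
    assert(state == STATE_ACK); /* the loop only exits once this holds */
    CHECK(pthread_mutex_unlock(&lock));
    return NULL;
}

/* Runs one handshake and returns the final state. */
int run_handshake(void)
{
    pthread_t tid;
    void *result;

    CHECK(pthread_mutex_lock(&lock));
    state = STATE_INIT;
    CHECK(pthread_mutex_unlock(&lock));

    CHECK(pthread_create(&tid, NULL, worker, NULL));

    CHECK(pthread_mutex_lock(&lock));
    /* Same rule on this side: re-test the state after every wakeup. */
    while (state != STATE_RUN)
        CHECK(pthread_cond_wait(&cond, &lock));
    state = STATE_ACK;
    CHECK(pthread_cond_signal(&cond));
    CHECK(pthread_mutex_unlock(&lock));

    /* A real pointer is passed for the thread's result rather than 0,
     * following the remark about pthread_join in the message above. */
    CHECK(pthread_join(tid, &result));
    return state; /* safe to read unlocked: the worker has been joined */
}
```

Calling `run_handshake()` from a `main()` and linking with the pthreads library should return `STATE_ACK` whichever thread takes the mutex first. If it hangs with loops like these in place, that would point more strongly at the library than at the test.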