Mailing-List: contact cygwin-developers-help AT cygwin DOT com; run by ezmlm
Sender: cygwin-developers-owner AT cygwin DOT com
Delivered-To: mailing list cygwin-developers AT cygwin DOT com
Date: Sun, 29 Sep 2002 14:56:04 -0400
From: Christopher Faylor
To: cygwin-developers AT cygwin DOT com
Subject: Re: Many pthread failures in the test suite, one setgroup failure
Message-ID: <20020929185604.GA25789@redhat.com>
Reply-To: cygwin-developers AT cygwin DOT com
Mail-Followup-To: cygwin-developers AT cygwin DOT com
References: <20020929000215 DOT GB10872 AT redhat DOT com>
 <1033264646 DOT 4375 DOT 78 DOT camel AT lifelesswks>
 <20020929020609 DOT GB11549 AT redhat DOT com>
 <1033265603 DOT 4374 DOT 95 DOT camel AT lifelesswks>
 <20020929022338 DOT GA12659 AT redhat DOT com>
 <1033267203 DOT 4374 DOT 100 DOT camel AT lifelesswks>
 <20020929024420 DOT GA13416 AT redhat DOT com>
 <1033271512 DOT 4372 DOT 102 DOT camel AT lifelesswks>
 <20020929141659 DOT GA23836 AT redhat DOT com>
 <1033309085 DOT 11273 DOT 69 DOT camel AT lifelesswks>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1033309085.11273.69.camel@lifelesswks>
User-Agent: Mutt/1.4i

On Mon, Sep 30, 2002 at 12:18:04AM +1000, Robert Collins wrote:
>On Mon, 2002-09-30 at 00:16, Christopher Faylor wrote:
>
>> I've uploaded a pthread-condvar6.exe.bz2 to the same location.  I assume
>> that this must have something to do with gcc 3.2, too, since it fails
>> so consistently for me.
>
>Ok, I'll pick it up and play with it tomorrow. It's bedtime for me now
>though.

AFAICT, the problem may be a known one:

/* FIXME: there's a potential race with PTHREAD_MUTEX_INITALIZER:
   the mutex is not actually inited until the first use.
   So two threads trying to lock/trylock may collide.
   Solution: we need a global mutex on mutex creation, or possibly simply
   on all constructors that allow INITIALIZER macros.
   the lock should be very small: only around the init routine, not
   every test, or all mutex access will be synchronised. */

I don't know why it is being tickled now or why it is so consistent, but
if I initialize the mutex in condvar6.c, I don't see the problem.

When the program fails, it is returning 22 (EINVAL) from
pthread_cond_timedwait.

I couldn't quite catch what was setting the EINVAL, for some reason,
though.  I rebuilt the DLL with debugging and set some breakpoints on
likely "return EINVAL"s, but they didn't get hit in this context.  I did
see an occasional EINVAL, though, and it seemed to be happening at
__pthread_cond_dowait here:

  if (!pthread_mutex::isGoodObject (themutex))
    return EINVAL;

I couldn't tell if this was the result of a debugging artifact or if it
was an actual problem, but in this context themutex == cond, which
certainly seems wrong.

This is unfortunately just speculation so far, since I haven't been able
to catch what's going on.  It does seem to be timing related, since
stopping and single stepping often seems to cause the program to "just
work".

Anyway, that's my current brain dump.

cgf