Mail Archives: cygwin-developers/2001/09/23/23:57:10
> -----Original Message-----
> From: Jason Tishler [mailto:jason AT tishler DOT net]
> Sent: Monday, September 24, 2001 1:46 PM
>
> Rob,
>
> On Mon, Sep 24, 2001 at 12:54:33PM +1000, Robert Collins wrote:
> > > -----Original Message-----
> > > From: Jason Tishler [mailto:jason AT tishler DOT net]
> > >
> > > While trying to release my first threaded Python
> > > distribution, I believe
> > > that I have found another pthreads hang. For those
> > > interested, see the
> > > attached for a gdb session.
> >
> > This is a *known* race condition. It has always existed on 9x, and I had
> > no choice for NT about introducing it.
>
> I knew this and then forgot (sorry). Thanks for reminding me.
>
> > I'm working on some upgrades to
> > the muto object which will _hopefully_ allow a correct fix for both
> > platforms.
> >
> > The race is that an event can get missed if one thread is entering a
> > wait just when another signals.
>
> I'm leaning toward holding off releasing a threaded Python until your
> muto upgrade is complete. Do you concur?

There's more than the muto change needed to make it "good". The second
statement (the wait) and other threads calling the signal() clause need
to be protected from each other. What that requires is a lock _that is
reset when the wait function is called_. This does not exist on 95 at
all (no SignalObjectAndWait). On NT it cannot be done for
CriticalSections at all, so I'm going to have to find some way to create
SignalMutoAndWait. I've some ideas, but nothing concrete just yet. (Any
realtime programmers want to pop up and offer some ring 3 assembler to
achieve this?)
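
For reference, the semantics being asked for are the ones POSIX specifies for pthread_cond_wait: the lock is released and the thread starts waiting as one atomic step. Below is a minimal sketch of that behaviour using the POSIX API itself, not the Cygwin internals. The names `lock`, `cond`, `ready`, `signaller` and `wait_for_ready` are made up for illustration.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical names; this shows the required semantics only. */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond = PTHREAD_COND_INITIALIZER;
static bool ready = false;

static void *signaller(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock);      /* blocks until the waiter is really waiting */
    ready = true;                   /* change the predicate under the lock */
    pthread_cond_signal(&cond);
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Returns 1 once the waiter has observed the signal, 0 on setup failure. */
int wait_for_ready(void)
{
    pthread_t t;

    pthread_mutex_lock(&lock);
    ready = false;
    if (pthread_create(&t, NULL, signaller, NULL) != 0) {
        pthread_mutex_unlock(&lock);
        return 0;
    }
    /* pthread_cond_wait releases `lock` and blocks atomically, so the
       signaller cannot slip in between the test and the wait. */
    while (!ready)
        pthread_cond_wait(&cond, &lock);
    pthread_mutex_unlock(&lock);
    pthread_join(t, NULL);
    return 1;
}
```

Calling wait_for_ready() from a main() built with -pthread should return 1. The point is that the signaller can only take the lock after the waiter has given it up inside the wait call, which is the guarantee that is missing when the unlock and the wait are two separate steps.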

I'm currently thinking of something ugly involving a semaphore (only 1
thread in a wait at a time), an interlocked counter (increment _before_
testing the semaphore to indicate that a thread wants into the wait
state), an event (the waiting thread has woken up, gate another thread
into a wait condition) and a second event (a signal has occurred, wake
any waiting threads up).

The advantage of this is that the signal-and-wait exercise is no longer
racy - we have a gateway into the wait state as well as the
allowed-to-alter-conditionvariable state. The downside is _much_ more
synchronisation. However, semaphores should be pretty fast, especially
when compared to mutexes.
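
To make the "announce before waiting" idea concrete, here is a simplified sketch in portable POSIX C. It is not the four-object design described above and not Cygwin code: it uses one counter (guarded by a small mutex instead of an interlocked operation) and one counting semaphore, and it assumes unnamed POSIX semaphores (sem_init) are available. All `sketch_*` names are hypothetical.

```c
#include <pthread.h>
#include <semaphore.h>

typedef struct {
    pthread_mutex_t guard;   /* protects `waiters` (stand-in for an interlocked op) */
    int             waiters; /* threads that have announced they will wait */
    sem_t           wake;    /* a post is remembered even if nobody waits yet */
} sketch_cond;

int sketch_cond_init(sketch_cond *c)
{
    c->waiters = 0;
    if (pthread_mutex_init(&c->guard, NULL) != 0)
        return -1;
    return sem_init(&c->wake, 0, 0);
}

/* Caller must hold `user_lock`, as with a normal condition variable. */
void sketch_cond_wait(sketch_cond *c, pthread_mutex_t *user_lock)
{
    pthread_mutex_lock(&c->guard);
    c->waiters++;                    /* announce BEFORE giving up user_lock */
    pthread_mutex_unlock(&c->guard);
    pthread_mutex_unlock(user_lock);
    while (sem_wait(&c->wake) != 0)  /* retry if interrupted */
        ;
    pthread_mutex_lock(user_lock);
}

void sketch_cond_signal(sketch_cond *c)
{
    pthread_mutex_lock(&c->guard);
    if (c->waiters > 0) {            /* someone announced: leave them a wakeup */
        c->waiters--;
        sem_post(&c->wake);
    }
    pthread_mutex_unlock(&c->guard);
}

/* Small demo: one waiter, one signalling thread. */
static sketch_cond cv;
static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
static int flag;

static void *poster(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&m);
    flag = 1;
    pthread_mutex_unlock(&m);
    sketch_cond_signal(&cv);
    return NULL;
}

int sketch_demo(void)
{
    pthread_t t;

    if (sketch_cond_init(&cv) != 0)
        return 0;
    pthread_mutex_lock(&m);
    flag = 0;
    if (pthread_create(&t, NULL, poster, NULL) != 0) {
        pthread_mutex_unlock(&m);
        return 0;
    }
    while (!flag)
        sketch_cond_wait(&cv, &m);
    pthread_mutex_unlock(&m);
    pthread_join(t, NULL);
    return flag;
}
```

Because the counter is bumped while the caller still holds its own lock, and the semaphore keeps the post until it is consumed, a signal that arrives between the unlock and the sem_wait is not lost. The sketch does not stop a later waiter from consuming a wakeup meant for an earlier one, which is presumably part of what the extra gating semaphore and events in the fuller design are for.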

So it's up to you really. I don't know when I'll get it done (all my
current hack time is on the daemon/setup). How often does it occur? Is
it in the test suite only?

Rob