delorie.com/archives/browse.cgi   search  
Mail Archives: cygwin-developers/2002/09/08/12:43:10

Mailing-List: contact cygwin-developers-help AT cygwin DOT com; run by ezmlm
List-Subscribe: <mailto:cygwin-developers-subscribe AT cygwin DOT com>
List-Archive: <http://sources.redhat.com/ml/cygwin-developers/>
List-Post: <mailto:cygwin-developers AT cygwin DOT com>
List-Help: <mailto:cygwin-developers-help AT cygwin DOT com>, <http://sources.redhat.com/ml/#faqs>
Sender: cygwin-developers-owner AT cygwin DOT com
Delivered-To: mailing list cygwin-developers AT cygwin DOT com
Date: Sun, 8 Sep 2002 20:41:51 +0400
From: egor duda <deo AT logos-m DOT ru>
Reply-To: egor duda <cygwin-developers AT cygwin DOT com>
Organization: deo
X-Priority: 3 (Normal)
Message-ID: <195826558.20020908204151@logos-m.ru>
To: Christopher Faylor <cygwin-developers AT cygwin DOT com>
Subject: Re: hang in sig_wait waiting for debug lock
In-Reply-To: <20020906210506.GB26002@redhat.com>
References: <7710998905 DOT 20020828173811 AT logos-m DOT ru>
<20020905153320 DOT GC16827 AT redhat DOT com> <591948241 DOT 20020906185459 AT logos-m DOT ru>
<20020906151713 DOT GC21699 AT redhat DOT com> <1176882135 DOT 20020906201713 AT logos-m DOT ru>
<20020906210506 DOT GB26002 AT redhat DOT com>
MIME-Version: 1.0

Hi!

Saturday, 07 September, 2002 Christopher Faylor cgf AT redhat DOT com wrote:

CF> On Fri, Sep 06, 2002 at 08:17:13PM +0400, egor duda wrote:
>>Hi!
>>
>>Friday, 06 September, 2002 Christopher Faylor cgf AT redhat DOT com wrote:
>>
>>>>Changelog states, however, that setclexec stuff isn't needed. Yet i
>>>>can't see why we shouldn't process protected handle list as long as we
>>>>recreating handles during set-close-on-exec operation. Can you give a
>>>>comment?
>>
>>CF> I assume that you mean this entry:
>>
>>CF> 2002-07-14  Christopher Faylor  <cgf AT redhat DOT com>
>>
>>CF>         * dcrt0.cc (dll_crt0_1): Move debug_init call back to here.  Avoid a
>>CF>         compiler warning.
>>CF>         * shared.cc (memory_init): Remove debug_init call.
>>CF>         * debug.h (handle_list): Change "clexec" to "inherited".
>>CF>         * debug.cc: Remove a spurious declaration.
>>CF>         (setclexec): Conditionalize away since it is currently unused.
>>CF>         (add_handle): Use inherited field rather than clexec.
>>CF>         (debug_fixup_after_fork_exec): Ditto.  Move debugging output to
>>CF>         delete_handle.
>>CF>         (delete_handle): Add debugging output.
>>CF>         * fhandler.cc (fhandler_base::set_inheritance): Don't bother setting
>>CF>         inheritance in debugging table since the handle was never protected
>>CF>         anyway.
>>CF>         (fhandler_base::fork_fixup): Ditto.
>>
>>CF> I'm at a loss to understand why adding additional things into the
>>CF> protected handle table would solve a race.
>>
>>I thought about it again and here's a hypothesis of what may be
>>happening.
>>
>>I suspect that it's not exactly a race. I.e., it's caused not by
>>randomness in order in which different threads of control are
>>executed, but by randomness in which handles are allocated by OS.
>>If value of some handle allocated in one process is equal to value
>>of handle we were dealing with in other, we may got warnings from
>>add_handle.

CF> I don't see this either.  If the table is populated with handles from
CF> another process that don't exist in this one then that's a bug.

It's clearly so. And enabling setclexec() removes it. I can guess,
that it's not the way things were supposed to work, right?

>>system_printf is pumping data to STD_ERROR_HANDLE. It's possibly a
>>pipe to tty master. Handling data in tty master thread is quite
>>complicated, and may possibly get to the same add_handle() but with
>>muto already locked. Normally it's not a big problem since
>>system_printf() will return asynchronously to tty master and unlock
>>the mutex. But here we have the second nasty random thing that may
>>happen: The pipe may be filled up. In this case WriteFile in
>>system_printf blocked until master drain the data from pipe. And
>>master may be blocked because it wants to protect a handle but debug
>>muto is locked.
>>
>>I've noticed special here.unlock() before debug_printf() in
>>add_handle(). Could it be that it was added there for similar reasons?
>>If not, then it's not clear why we should unlock mute explicitly when
>>it will be unlocked in the next line when 'return' statement is
>>executed?

CF> Possibly.  Are you seeing system_printf output in your failing case?

Yes, but it looks that it not full. Actually, pipe hasn't to be filled
up. After WriteFile() we call FlushFileBuffers (). Documentation on
FlushFileBuffers states that if handle is a pipe then the operation
blocks until pipe peer reads _all_ data from pipe. Commenting
FlushFileBuffers did helped in my case -- system_printf()s are not
blocking process now.

Yet i suppose the right way is to always unlock debug muto as fast as
possible, i.e., before calling debug_printf() and stuff. As far as i
understand, muto is protecting handle list, so we must hold it while
and only while we're messing with it. That should be safe.

Egor.            mailto:deo AT logos-m DOT ru ICQ 5165414 FidoNet 2:5020/496.19

- Raw text -


  webmaster     delorie software   privacy  
  Copyright © 2019   by DJ Delorie     Updated Jul 2019