Mail Archives: cygwin/2006/03/16/15:14:17

X-Spam-Check-By: sourceware.org

MIME-Version: 1.0

Subject: Re: Shells hang during script execution

Date: Thu, 16 Mar 2006 15:14:03 -0500

Message-ID: <B6C33E7A8278A0408B707C9B491720D4045298@STEELPO.steeleye.com>

From: "Ernie Coskrey" <Ernie DOT Coskrey AT steeleye DOT com>

To: <cygwin AT cygwin DOT com>

X-IsSubscribed: yes

Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm

List-Subscribe: <mailto:cygwin-subscribe AT cygwin DOT com>

List-Archive: <http://sourceware.org/ml/cygwin/>

List-Post: <mailto:cygwin AT cygwin DOT com>

List-Help: <mailto:cygwin-help AT cygwin DOT com>, <http://sourceware.org/ml/#faqs>

Sender: cygwin-owner AT cygwin DOT com

Mail-Followup-To: cygwin AT cygwin DOT com

Delivered-To: mailing list cygwin AT cygwin DOT com

X-MIME-Autoconverted: from quoted-printable to 8bit by delorie.com id k2GKEE7h022912

>On Wed, Mar 01, 2006 at 01:01:46PM -0500, Ernie Coskrey wrote:
>>>>Here's a description of a second hang condition we were encountering, along 
>>>>with a patch for it.
>>>>
>>>>
>>>>The application (pdksh in this case) does a read on a pipe, which eventually 
>>>>calls pipe.cc fhandler_pipe::read in Thread 1.  This creates a new cygthread 
>>>>with "read_pipe()" as the function.  Then it calls th->detach(read_state).
>>>>
>>>>When the hang occurs, the new thread gets terminated early, before
>>>>cygthread::stub() can call "callfunc()".  You see the error message
>>>>"erroneous thread activation".  I'm not sure what's causing the thread
>>>>to fail activation, but the result is, the read_state semaphore never
>>>>gets signalled.
>>>
>>>Sorry but this is another band-aid around a problem.  The real problem
>>>is that the code shouldn't get into the state that you are describing.
>>>That's why cygwin prints an error message - it is a serious problem.
>>>Making the code deal gracefully with a problem like this isn't going
>>>to solve the underlying issue.
>>>
>>>If you can figure out what's causing the erroneous thread activation
>>>then that will be the real culprit.
>>>
>>>cgf
>>>
>>
>>OK, I believe I've tracked this down.
>>
>>The problem occurs when we get into a read_pipe cygthread constructor
>>(cygthread::cygthread()) with a NULL h and an ev that is signalled.
>>When this condition exists, a hang can occur as follows:
>>
>>1) Creator thread calls detach().  This waits for read_state to be released twice
>>2) read_pipe thread calls read_pipe, reads data, and releases the semaphore twice
>>3) Creator thread goes to WFSO(*this, INFINITE) which returns immediately because ev was set when the thread was created.
>>4) Creator thread initiates another read_pipe cygthread to read more pipe data.
>>
>>At this point, there's a race: if the Creator thread gets past the
>>initialization part of the constructor, which sets __name(name), BEFORE
>>the original read_pipe thread gets to the part of cygthread::stub()
>>that sets info->__name = NULL, then you'll see the hang.  The new
>>read_pipe will give the "erroneous thread activation" message, and the
>>parent will be stuck waiting for data that will never arrive.
>>
>>The only path that leaves an unused thread structure in a state where
>>h==NULL and ev is signalled is cygthread::release().  So the fix is
>>simple:
>>
>>$ cat cygthread.cc.udiff
>>--- cygthread.cc.ORIG   2006-02-22 10:57:42.123931300 -0500
>>+++ cygthread.cc        2006-03-01 12:59:23.255023000 -0500
>>@@ -268,7 +268,12 @@
>> cygthread::release (bool nuke_h)
>> {
>>   if (nuke_h)
>>+    {
>>     h = NULL;
>>+
>>+    if (ev)
>>+      ResetEvent (ev);
>>+    }
>> #ifdef DEBUGGING
>>   __oldname = __name;
>>   debug_printf ("released thread '%s'", __oldname);
>
>Nice analysis.  Thank you.  I think it's easier to fix this by just
>making the ev event auto-reset; then this condition would be caught in
>terminate thread, as it was meant to be.
>
>cgf

Here's a patch for the problem that works with the latest snapshot.

-----
Ernie Coskrey       SteelEye Technology, Inc.



--- cygthread.cc.ORIG	2006-03-01 17:40:44.000000000 -0500
+++ cygthread.cc	2006-03-16 14:54:04.148312500 -0500
@@ -78,7 +78,7 @@
       debug_printf ("thread '%s', id %p, stack_ptr %p", info->name (), info->id, info->stack_ptr);
       if (!info->ev)
 	{
-	  info->ev = CreateEvent (&sec_none_nih, TRUE, FALSE, NULL);
+	  info->ev = CreateEvent (&sec_none_nih, FALSE, FALSE, NULL);
 	  info->thread_sync = CreateEvent (&sec_none_nih, FALSE, FALSE, NULL);
 	}
     }
@@ -197,8 +197,6 @@
   HANDLE htobe;
   if (h)
     {
-      if (ev)
-	ResetEvent (ev);
       while (!thread_sync)
 	low_priority_sleep (0);
       SetEvent (thread_sync);
@@ -223,7 +221,6 @@
       while (!ev)
 	low_priority_sleep (0);
       WaitForSingleObject (ev, INFINITE);
-      ResetEvent (ev);
     }
   h = htobe;
 }
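
The first hunk above changes the second argument of CreateEvent (bManualReset) from TRUE to FALSE, so ev becomes an auto-reset event and the explicit ResetEvent calls are no longer needed. As a rough illustration of why that matters for step 3 of the analysis (a wait that returns immediately on a stale signal), here is a minimal portable sketch. It is not Cygwin code: the Event class and stale_signal_survives() are hypothetical stand-ins built on standard C++ primitives, and they only model the signalled/non-signalled behaviour of a Win32 event.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical model of a Win32 event object (assumption: only the
// manual-reset vs. auto-reset distinction is modelled, nothing else).
class Event {
public:
  explicit Event(bool manual_reset) : manual_reset_(manual_reset) {}

  // Analogous to SetEvent: mark the event signalled and wake waiters.
  void set() {
    std::lock_guard<std::mutex> lock(m_);
    signalled_ = true;
    cv_.notify_all();
  }

  // Analogous to ResetEvent: clear the signalled state.
  void reset() {
    std::lock_guard<std::mutex> lock(m_);
    signalled_ = false;
  }

  // Analogous to WaitForSingleObject with a timeout.
  // Returns true if the event was signalled before the timeout expired.
  bool wait_for(std::chrono::milliseconds timeout) {
    std::unique_lock<std::mutex> lock(m_);
    if (!cv_.wait_for(lock, timeout, [this] { return signalled_; }))
      return false;
    if (!manual_reset_)
      signalled_ = false;  // auto-reset: a successful wait consumes the signal
    return true;
  }

private:
  std::mutex m_;
  std::condition_variable cv_;
  bool signalled_ = false;
  const bool manual_reset_;
};

// Signal once, wait once, then report whether a second wait would still
// return immediately, i.e. whether the old signal is left behind.
bool stale_signal_survives(bool manual_reset) {
  Event ev(manual_reset);
  ev.set();
  ev.wait_for(std::chrono::milliseconds(0));         // first wait succeeds
  return ev.wait_for(std::chrono::milliseconds(0));  // second wait
}
```

In this model, stale_signal_survives(true) reports that a manual-reset event stays signalled until someone calls reset(), which is the window the analysis describes; stale_signal_survives(false) reports that an auto-reset event is cleared by the first successful wait.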


--
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
Problem reports:       http://cygwin.com/problems.html
Documentation:         http://cygwin.com/docs.html
FAQ:                   http://cygwin.com/faq/
