Mail Archives: cygwin/2002/09/22/12:40:48

Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm
List-Subscribe: <mailto:cygwin-subscribe AT cygwin DOT com>
List-Archive: <http://sources.redhat.com/ml/cygwin/>
List-Post: <mailto:cygwin AT cygwin DOT com>
List-Help: <mailto:cygwin-help AT cygwin DOT com>, <http://sources.redhat.com/ml/#faqs>
Sender: cygwin-owner AT cygwin DOT com
Mail-Followup-To: cygwin AT cygwin DOT com
Delivered-To: mailing list cygwin AT cygwin DOT com
Date: Sun, 22 Sep 2002 12:40:54 -0400
From: Christopher Faylor <cgf AT redhat DOT com>
To: cygwin AT cygwin DOT com
Subject: Re: cvs cygwin1.dll
Message-ID: <20020922164054.GB25597@redhat.com>
Reply-To: cygwin AT cygwin DOT com
References: <3d81aa1b DOT 1496411 AT smtp DOT ntlworld DOT com> <20020913125816 DOT GA1030 AT redhat DOT com> <3d88c6c1 DOT 17117483 AT smtp DOT ntlworld DOT com> <20020918193553 DOT GA9328 AT redhat DOT com> <3d8bf2b5 DOT 1540735 AT smtp DOT ntlworld DOT com> <20020920155657 DOT GH24740 AT redhat DOT com> <3d90daeb DOT 24161151 AT smtp DOT ntlworld DOT com>
Mime-Version: 1.0
In-Reply-To: <3d90daeb.24161151@smtp.ntlworld.com>
User-Agent: Mutt/1.4i

On Sun, Sep 22, 2002 at 03:08:41PM +0000, Guy Harrison wrote:
>On Fri, 20 Sep 2002 11:56:57 -0400, Christopher Faylor <cgf AT redhat DOT com>
>wrote:
>
>>On Fri, Sep 20, 2002 at 11:26:42AM +0000, Guy Harrison wrote:
>>>On Wed, 18 Sep 2002 15:35:53 -0400, Christopher Faylor <cgf AT redhat DOT com>
>>>wrote:
>
>Shame us non-developers can't get it "readonly".
>http://cygwin.com/ml/cygwin-developers/2002-09/msg00071.html

Yeah.  Life's a bitch, isn't it?

Anyway, you are looking at the wrong message.

Remember this?

On Fri, Sep 20, 2002 at 11:56:57AM -0400, Christopher Faylor wrote:
>I suspect that this is actually due to a deadlock in the code in init.cc
>which was recently discussed in cygwin-developers.

If you look at my message that immediately follows the one you mention,
the one that actually *mentions init.cc*, you will see the cause of at
least one deadlock-on-exit in cygwin.

Robert Collins has vowed to fix this problem this weekend.  Until then,
however, I have commented out the code in question.

>>I don't think it has anything to do with suspended threads.  You can
>>certainly verify this by adding code to kill the threads specifically,
>>though, and see what happens.
>
>I did. I declared threads[1]. All the work gets shoved onto
>cygthread::simplestub which neither suspends nor stays resident.

Not the same thing at all.

>Hung process:
>
>Name---------Pid-Pri-Thd--Hnd----Mem-----User-Time---Kernel-Time---Elapsed-Time
>sh-----------344---4---1---67---1832---0:00:00.020---0:00:00.080----0:02:29.935
>----------------------VM------WS---WS-Pk----Priv---Faults-NonP-Page-PageFile
>------------------351732----1832----1964----1476------492----3---21-----1476
>-Tid-Pri----Cswtch------------State-----User-Time---Kernel-Time---Elapsed-Time
>-548---4---------1---Wait:Suspended---0:00:00.000---0:00:00.000----0:02:29.825
>
>Relevant log:
>
>Quick Key:
><GetCurrentProcessId/GetCurrentThreadId> 90 GetCommandLine() chars
>[n/32] =threads[n] of NTHREADS=32
>mti    =main_thread_id
>nam    =ignore fixed on "mti" here
>sdc    =SD_count (member added to cygthread class) suspend count
>av     =threads[].avail
>id     =threads[].id
>h      =threads[].h
>sus    =another suspend count
>gle    =GetLastError() for failed "sus"
>
><344/509> cli(90):J:\cygwin\bin\sh.exe
>pid=344 tid=509[0/32]{mti:509}: nam=[main] sdc=-999 av=877 id=0 h=296
>sus=2 gle=0 
>pid=344 tid=509[1/32]{mti:509}: nam=[main] sdc=-999 av=212 id=0 h=300
>sus=2 gle=0 
>pid=344 tid=509[2/32]{mti:509}: nam=[main] sdc=-999 av=894 id=0 h=304
>sus=2 gle=0 
>pid=344 tid=509[3/32]{mti:509}: nam=[main] sdc=-999 av=482 id=0 h=308
>sus=2 gle=0 
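The records above follow the Quick Key closely enough to be read
mechanically. What follows is a minimal sketch of such a reader, not
anything from the poster's instrumented DLL. The names ThreadRecord,
parse_record and check_first_record are hypothetical, and the sketch
assumes that the mail quoting has been stripped, that the wrapped
"sus=... gle=..." line has been joined back onto its record, and that
every number is printed in decimal.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// One per-thread record from the dump. Field names follow the Quick Key;
// the struct and function names are hypothetical.
struct ThreadRecord {
    int pid = 0;       // GetCurrentProcessId()
    int tid = 0;       // GetCurrentThreadId()
    int slot = 0;      // n in [n/32], the index into threads[]
    int nthreads = 0;  // NTHREADS
    int mti = 0;       // main_thread_id
    std::string nam;   // thread name as printed between the brackets
    int sdc = 0;       // SD_count
    int av = 0;        // threads[].avail
    int id = 0;        // threads[].id
    int h = 0;         // threads[].h, read as decimal (an assumption)
    int sus = 0;       // the second suspend count
    int gle = 0;       // GetLastError() for a failed "sus"
};

// Expects one whole record on one line. Returns false unless all twelve
// fields were matched.
bool parse_record(const std::string& line, ThreadRecord& out) {
    char nam[32] = {0};
    const int n = std::sscanf(
        line.c_str(),
        "pid=%d tid=%d[%d/%d]{mti:%d}: nam=[%31[^]]] "
        "sdc=%d av=%d id=%d h=%d sus=%d gle=%d",
        &out.pid, &out.tid, &out.slot, &out.nthreads, &out.mti, nam,
        &out.sdc, &out.av, &out.id, &out.h, &out.sus, &out.gle);
    if (n != 12) return false;
    out.nam = nam;
    return true;
}

// Usage: the first record from the log above, joined onto one line.
void check_first_record() {
    ThreadRecord r;
    const bool ok = parse_record(
        "pid=344 tid=509[0/32]{mti:509}: nam=[main] "
        "sdc=-999 av=877 id=0 h=296 sus=2 gle=0", r);
    assert(ok);
    assert(r.slot == 0 && r.sdc == -999 && r.sus == 2);
}
```

Matching all twelve fields or rejecting the line keeps a half-wrapped
record from being mistaken for a complete one.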
>
>The ::SuspendThread() and ::ResumeThread() calls in cygthread.cc assign
>their result directly to SD_count. I set it explicitly to silly negative
>values at these points:
>
>-999 in cygthread::runner() after their ::CreateThread()
>-99 in cygthread::stub just prior to init_exceptions()
>-2 cygthread::exit_thread ::SetEvent()
>-9999 cygthread::stub ::ExitThread()
>
>Nothing else touches 'SD_count'. The above output is generated by a
>function 'SD_DumpLiving()' inserted immediately prior to ::ExitProcess()
>within _pinfo::exit().
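The marker scheme described above can be summarised as a small lookup.
This is a sketch under stated assumptions, not code from the instrumented
DLL: describe_sdc and check_markers are hypothetical names, and SD_count
is assumed to be a signed integer, so that the (DWORD)-1 failure value of
::SuspendThread() and ::ResumeThread() shows up as -1. On success both
calls return the thread's previous suspend count, so any non-negative
value is treated as such a count.

```cpp
#include <cassert>
#include <string>

// Maps an SD_count value to the point in the code that last wrote it,
// using the four markers listed in the message. Hypothetical helper.
std::string describe_sdc(long sdc) {
    switch (sdc) {
    case -999:  return "cygthread::runner(), after ::CreateThread()";
    case -99:   return "cygthread::stub, just prior to init_exceptions()";
    case -2:    return "cygthread::exit_thread, ::SetEvent()";
    case -9999: return "cygthread::stub, ::ExitThread()";
    case -1:    return "::SuspendThread()/::ResumeThread() failed";
    default:
        break;
    }
    if (sdc >= 0) {
        // A successful call reports the suspend count before the call.
        return "previous suspend count " + std::to_string(sdc);
    }
    return "unknown marker";
}

// Usage: every record in the log above carries sdc=-999.
void check_markers() {
    assert(describe_sdc(-999).find("runner") != std::string::npos);
    assert(describe_sdc(1) == "previous suspend count 1");
}
```

Read this way, sdc=-999 says only that the last recorded write for that
slot was the one in cygthread::runner(). It does not by itself show
whether the thread is currently suspended.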
>
>Our hung process is definitely suspended.

Not necessarily.  I see nothing in the above which would disprove the
theory that this is the problem which I raised in cygwin-developers.  Of
course, I am not 100% sure that I understand the above data.  However,
I'm not going to devote too much time to studying it since there is an
obvious problem in the cygwin DLL now and you haven't, AFAICT, addressed
that.  This makes your analysis suspect until you generate a version of
the DLL without the already known problem.

However, if you want to provide an actual analysis of how the thread
could be in a suspended state with someone waiting for it, that would be
welcome.  So far, everything you've provided points to the fact that the
process in question is stuck in a deadlock state during ExitProcess, which
sort of confirms my theory.  That's why you can't debug it.

cgf

--
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
Bug reporting:         http://cygwin.com/bugs.html
Documentation:         http://cygwin.com/docs.html
FAQ:                   http://cygwin.com/faq/
