Mail Archives: cygwin/2022/04/09/16:16:38

Date: Sat, 09 Apr 2022 22:35:03 +0300
From: Alexey Izbyshev <izbyshev AT ispras DOT ru>
To: Takashi Yano <takashi DOT yano AT nifty DOT ne DOT jp>
Subject: Re: Deadlock of the process tree when running make
Cc: cygwin AT cygwin DOT com

On 2022-04-09 20:54, Takashi Yano wrote:
> Thanks for checking. This seems to be normal. Then, I cannot
> understand why the ClosePseudoConsole() call is blocked...
> 
> The document by Microsoft mentions the blocking conditions of
> ClosePseudoConsole():
> https://docs.microsoft.com/en-us/windows/console/closepseudoconsole
> however, the thread above is draining the channel.

I decided to check which object ClosePseudoConsole() waits on. The 
wait happens inside the unexported KERNELBASE!_ClosePseudoConsoleMembers 
function. Here is the relevant part:

76589fb5 8b4e08          mov     ecx,dword ptr [esi+8]
76589fb8 e8c2fdffff      call    KERNELBASE!_HandleIsValid (76589d7f)
76589fbd 84c0            test    al,al
76589fbf 7456            je      KERNELBASE!_ClosePseudoConsoleMembers+0x89 (7658a017)
76589fc1 8d45fc          lea     eax,[ebp-4]
76589fc4 895dfc          mov     dword ptr [ebp-4],ebx
76589fc7 50              push    eax
76589fc8 51              push    ecx
76589fc9 e8c23ef5ff      call    KERNELBASE!GetExitCodeProcess (764dde90)
76589fce 85c0            test    eax,eax
76589fd0 7414            je      KERNELBASE!_ClosePseudoConsoleMembers+0x58 (76589fe6)
76589fd2 817dfc03010000  cmp     dword ptr [ebp-4],103h
76589fd9 750b            jne     KERNELBASE!_ClosePseudoConsoleMembers+0x58 (76589fe6)
76589fdb 53              push    ebx
76589fdc 6aff            push    0FFFFFFFFh
76589fde ff7608          push    dword ptr [esi+8]
76589fe1 e8ba74f6ff      call    KERNELBASE!WaitForSingleObjectEx (764f14a0)

"esi" is the argument of ClosePseudoConsole(), so the first mov 
dereferences it at offset 8 and loads a process handle. If this handle 
is valid, the code calls GetExitCodeProcess(); if that call succeeds and 
reports an exit code of STILL_ACTIVE (103h), it waits for that process 
with an infinite timeout (0FFFFFFFFh).
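
The control flow of the listing can be restated as a small model. This 
is only a sketch of the three branches visible in the disassembly, not 
the actual KERNELBASE implementation; the function name and its 
parameters are hypothetical stand-ins for the results of 
_HandleIsValid(), GetExitCodeProcess() and the exit code it stores.

```python
STILL_ACTIVE = 0x103   # 259, the constant compared against [ebp-4]
INFINITE = 0xFFFFFFFF  # the timeout pushed before WaitForSingleObjectEx


def close_wait_timeout(handle_valid, query_ok, exit_code):
    """Return the timeout of the wait on the traced path, or None if
    the wait is skipped. All three parameters are illustrative."""
    if not handle_valid:           # test al,al / je ...+0x89
        return None
    if not query_ok:               # test eax,eax / je ...+0x58
        return None
    if exit_code != STILL_ACTIVE:  # cmp [ebp-4],103h / jne ...+0x58
        return None
    return INFINITE                # push 0FFFFFFFFh / call WaitForSingleObjectEx


if __name__ == "__main__":
    # A process that is still running is the only case that reaches the wait.
    print(close_wait_timeout(True, True, STILL_ACTIVE))
    # A process that has already exited (exit code 0 here) skips the wait.
    print(close_wait_timeout(True, True, 0))
```

In this model the only way to block forever is for the process behind 
the handle to stay alive.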

I've checked that the hanging bash process has only 3 process handles: 
for itself, for the dead javac, and for conhost.exe. So it must be 
waiting for the latter to terminate. (After I did all this, I realized 
there was a much easier way to get this result: the "Analyze wait chain" 
feature of Task Manager.)

Unfortunately, I don't know anything about Windows consoles, but just in 
case I also checked what the 5 threads of conhost.exe are waiting for:

1. Tries to enter a critical section (Task Manager claims it waits for 
thread 4, so the latter probably owns it).
2. Waits on a handle for "pty1-from-master-nat" named pipe.
3. Waits for an anonymous event.
4. Waits on a handle for "\Device\ConDrv" (in DeviceIoControl()).
5. Blocked in GetMessageW().
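
The dependencies above can be followed mechanically, much as a wait 
chain view would. The sketch below encodes only the edges reported in 
this message; the edge from thread 1 to thread 4 is Task Manager's 
claim, and the dictionary and function names are made up for 
illustration.

```python
# Wait edges taken from the observations in this message.
WAITS_ON = {
    # ClosePseudoConsole() in bash waits for the conhost.exe process.
    "bash": "conhost.exe",
    # Critical section that thread 4 probably owns (per Task Manager).
    "conhost.exe thread 1": "conhost.exe thread 4",
    # Thread 4 is blocked in DeviceIoControl() on the console driver.
    "conhost.exe thread 4": r"\Device\ConDrv",
}


def wait_chain(start, edges):
    """Follow 'waits on' edges from start until nothing further is
    known or a node repeats (which would indicate a cycle)."""
    chain = [start]
    seen = {start}
    while chain[-1] in edges:
        nxt = edges[chain[-1]]
        chain.append(nxt)
        if nxt in seen:
            break
        seen.add(nxt)
    return chain


if __name__ == "__main__":
    print(" -> ".join(wait_chain("conhost.exe thread 1", WAITS_ON)))
```

Nothing in the recorded edges forms a cycle; the chain from thread 1 
simply ends at the wait on "\Device\ConDrv".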

It's also worth noting that this conhost.exe seems to be the only one 
related to the Cygwin process tree (and the only related non-Cygwin 
process). All other conhost.exe processes were created before I started 
my stress test.

My guess is that this conhost.exe was created for a native app started 
from a Cygwin process. Could it be some race condition/bug that 
prevented conhost.exe from terminating once the native process (probably 
javac?) died?

Alexey
