Mail Archives: cygwin/2022/04/16/09:21:56

Date: Sat, 16 Apr 2022 16:21:34 +0300
From: Alexey Izbyshev <izbyshev AT ispras DOT ru>
To: Takashi Yano <takashi DOT yano AT nifty DOT ne DOT jp>
Subject: Re: Deadlock of the process tree when running make
In-Reply-To: <20220416183910.b532b2cc95725b508bfd0991@nifty.ne.jp>
References: <9388316255ada0e0fcb2d849cce5a894 AT ispras DOT ru>
<20220409191743 DOT 6da2268a36e8c9b4ab22c722 AT nifty DOT ne DOT jp>
<1ecd670b1cdff43e0b0d7e5ee4c9cfc5 AT ispras DOT ru>
<ab3971adb8f441fd16bb62e480547a95 AT ispras DOT ru>
<20220409204619 DOT dd0e53902d5e108ef462e510 AT nifty DOT ne DOT jp>
<907ce1b4416a826cb07990dd601bd687 AT ispras DOT ru>
<20220410015753 DOT 753e2a238513eaf2a3da81e9 AT nifty DOT ne DOT jp>
<f55466cdda02fa46bc43174ba412df3a AT ispras DOT ru>
<20220410025410 DOT 196aa0a04368147dbbb31d3e AT nifty DOT ne DOT jp>
<afad32070411d6d94d5d94da90478af4 AT ispras DOT ru>
<7204ed0aa2d6b3fcfb239010e6b67646 AT ispras DOT ru>
<20220410163432 DOT 00dd7b9f81f8f322d97688f2 AT nifty DOT ne DOT jp>
<0e1a53626639cb21369225ff9092ecfc AT ispras DOT ru>
<b937a782f8b8993e3d4a058a354596a7 AT ispras DOT ru>
<20220411173526 DOT 6243b9492e0fc3d4132a58a8 AT nifty DOT ne DOT jp>
<ab8ded5fb5dad09dc2aebe5b49aa7dac AT ispras DOT ru>
<1bdd5ac77277343fbff9b560fa98b15e AT ispras DOT ru>
<f25d76d5897f60ab1a5a52bd0dffd484 AT ispras DOT ru>
<20220416183910 DOT b532b2cc95725b508bfd0991 AT nifty DOT ne DOT jp>
User-Agent: Roundcube Webmail/1.4.4
Message-ID: <45f9160a597b25bc576eb153a138fb88@ispras.ru>
Cc: cygwin AT cygwin DOT com

On 2022-04-16 12:39, Takashi Yano wrote:
> I am not sure yet what is essential, but the current code closes
> the pseudo console only if no other process is attached
> to it. I wonder why javac.exe remains as a
> zombie. The parent bash.exe calls ClosePseudoConsole() when the
> child non-cygwin app terminates, i.e., after WaitForSingleObject()
> on the child process handle returns.
> https://www.cygwin.com/git/?p=newlib-cygwin.git;a=blob;f=winsup/cygwin/spawn.cc;h=81dba5a941e919ea2514013069aef22c6fad8004;hb=7ac0767053e278f0ce9811bf6f77278bd2f49c20#l1009
> 
> What does "zombie" mean? Is it listed in the process list of
> ProcessHacker? I still suspect that the zombie javac.exe holds
> the hWritePipe handle leaked from the parent bash.exe.
> 
By "zombie" I meant the same thing as in the Linux kernel: a data 
structure that remains after a process has terminated but has not yet 
been waited for (I don't know how this is implemented in Cygwin). So 
there is no javac.exe process in ProcessHacker, but "ps" and similar 
tools in Cygwin still list "javac".
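
To make the term concrete, here is a minimal sketch of a zombie in the 
Linux sense. It only illustrates the concept and is not the reproducer 
for this hang. It assumes a Linux-style /proc and a working os.fork(); 
the helper names proc_state() and demo_zombie() are made up for this 
example, and I have not checked how Cygwin's /proc reports such a 
process.

```python
import os


def proc_state(pid):
    """Return the one-letter state from /proc/<pid>/stat, or None if the entry is gone."""
    try:
        with open(f"/proc/{pid}/stat") as f:
            data = f.read()
    except (FileNotFoundError, ProcessLookupError):
        return None
    # The line looks like "pid (comm) state ..."; comm may itself contain
    # spaces or ")", so split on the last ")" before taking the state field.
    return data.rsplit(")", 1)[1].split()[0]


def demo_zombie():
    """Create a child, let it exit, and observe its state before and after reaping."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child: terminate immediately

    # Block until the child has exited, but leave it unreaped (WNOWAIT),
    # so the kernel keeps its bookkeeping entry around: a zombie.
    os.waitid(os.P_PID, pid, os.WEXITED | os.WNOWAIT)
    before = proc_state(pid)  # assumption: Linux reports "Z" here

    os.waitpid(pid, 0)        # the parent finally waits; the entry is released
    after = proc_state(pid)
    return before, after


if __name__ == "__main__":
    print(demo_zombie())
```

Between the waitid() and waitpid() calls the child no longer runs any 
code, yet it is still listed by tools that read the process table. That 
is the situation I mean for "javac" above.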

I'm now trying to create a small reproducer that I can share, and I 
had a first small success last night: I got a very similar hang with a 
simple Makefile and a script on Cygwin 3.3.4. Here is the process tree:

make(14479)-+-bash(14484)---bash(14611)
             |-bash(14515)---bash(14618)
             |-bash(14491)---bash(14500)---bash(14612)
             |-bash(14501)---bash(14510)---bash(14605)
             |-bash(14505)---bash(14607)
             |-bash(14494)---bash(14617)
             |-bash(14506)---bash(14513)---bash(14610)
             |-bash(14512)---bash(14518)---bash(14615)
             |-bash(14486)---bash(14495)---bash(14606)
             |-bash(14483)---bash(14490)---bash(14609)
             |-bash(14509)---bash(14614)
             |-bash(14489)---bash(14608)
             |-bash(14499)---bash(14613)
             |-bash(14481)---bash(14485)---python(14588)
             |-bash(14496)---bash(14504)---bash(14616)
             `-bash(14482)---bash(14604)


"python" is a zombie, just as "javac" is in the original case. There is 
again a single "conhost.exe", and all 5 of its threads are doing 
the same things as in the original case (including the signal pipe 
thread trying to EnterCriticalSection()). The only difference is that 
the leaf bash.exe processes are trying to acquire the pcon mutex at a 
different point [1], but I guess this difference is not important.

I'll try this reproducer with your patched DLL as well as on another 
machine, and I'll share it if it still reproduces the hang.

Thanks,
Alexey

[1] 
https://www.cygwin.com/git?p=newlib-cygwin.git;a=blob;f=winsup/cygwin/spawn.cc;h=81dba5a941e919ea2514013069aef22c6fad8004;hb=cygwin-3_3_4-release#l697

