Mail Archives: cygwin/2022/04/13/12:48:50

Date: Wed, 13 Apr 2022 19:48:04 +0300
From: Alexey Izbyshev <izbyshev AT ispras DOT ru>
To: Takashi Yano <takashi DOT yano AT nifty DOT ne DOT jp>
Subject: Re: Deadlock of the process tree when running make
Message-ID: <1bdd5ac77277343fbff9b560fa98b15e@ispras.ru>
Cc: cygwin AT cygwin DOT com

On 2022-04-11 13:10, Alexey Izbyshev wrote:
> On 2022-04-11 11:35, Takashi Yano wrote:
>> On Sun, 10 Apr 2022 23:49:29 +0300
>> A countermeasure version is available at the following location:
>> https://tyan0.yr32.net/cygwin/x86/test/cygwin1-20220411.dll.xz
>> https://tyan0.yr32.net/cygwin/x86_64/test/cygwin1-20220411.dll.xz
>> 
>> Could you please test? To keep the hanging tree, please install
>> cygwin another directory, and replace cygwin1.dll with the
>> countermeasure version.
>> 
> Thank you for providing the binaries! I've started testing in a
> separate cygwin installation on the same machine, as you suggested.
> The hang previously took many hours to reproduce, so I'll keep tests
> running for a while and then report back.
> 
The good news is that the tests have been running for two days so far 
without any cygwin-related issues, so the patched version doesn't seem 
to introduce new issues.

The bad news is that my theory about the suspicious "Unnamed file:
\FileSystem\Npfs" handle in the hanging bash.exe being a leak seems to
be wrong. I've closed that handle, but conhost.exe hasn't unblocked. All
of its threads are doing the same things as before:

1. Trying to enter a critical section. (Task Manager claims it waits for
thread 4, so the latter probably owns it.)
2. Blocked in ReadFile() on the "pty1-from-master-nat" named pipe.
3. Waiting for an anonymous event.
4. Waiting on a handle for "\Device\ConDrv" (in DeviceIoControl()).
5. Blocked in GetMessageW().
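
For reference, the leak theory rested on the usual pipe rule: a reader
blocked on a pipe sees end-of-file only after every handle to the write
end has been closed, so one stray handle is enough to keep it blocked.
Below is a minimal sketch of that rule. It uses Python's os.pipe() and
os.dup() as stand-ins, not the Windows named pipes or the pseudo console
API involved here, and the function name is hypothetical.

```python
import os
import threading


def reader_sees_eof(close_all_writers: bool, timeout: float = 0.5) -> bool:
    """Return True if a blocking pipe read reaches EOF within `timeout`."""
    read_fd, write_fd = os.pipe()
    extra_fd = os.dup(write_fd)  # stands in for a second (leaked) write handle
    got_eof = threading.Event()

    def reader():
        # os.read() returns b"" only once all write-end descriptors are closed.
        if os.read(read_fd, 1) == b"":
            got_eof.set()

    thread = threading.Thread(target=reader, daemon=True)
    thread.start()

    os.close(write_fd)
    if close_all_writers:
        os.close(extra_fd)

    result = got_eof.wait(timeout)

    if not close_all_writers:
        os.close(extra_fd)  # release the reader so the thread can exit
    thread.join()
    os.close(read_fd)
    return result
```

With both write descriptors closed the reader returns promptly; with the
duplicate left open it stays blocked until the timeout expires.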

I've again created a model situation on another machine, with bash.exe
stopped at a breakpoint in ClosePseudoConsole(). It seems that last time
I missed that bash.exe holds *two* handles for (different) "Unnamed
file: \FileSystem\Npfs" objects there too, so that seems to be normal.

What's probably not normal is the behavior of the hanging conhost.exe.
I've compared the points where conhost.exe is blocked, and all but one
of the threads in the model case are doing the same things as in the
hanging case. The remaining thread is blocked in
ReadFile("\Device\NamedPipe\") (i.e. on the read end of "hWritePipe" of
pcon) instead of trying to enter a critical section like thread 1 above.
So now I'm starting to suspect that it's some conhost.exe bug rather
than a cygwin bug.
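
To make the observed difference concrete, here is a generic model of a
thread stuck behind a lock whose owner is itself blocked in a read. This
is only an illustration of the pattern, not a claim about conhost.exe
internals: threading.Lock stands in for the critical section, an
os.pipe() read stands in for the blocking call, and all names are
hypothetical.

```python
import os
import threading


def waiter_is_stuck_behind_owner(timeout: float = 0.5):
    """Return (stuck_while_owner_blocked, free_after_owner_unblocked)."""
    read_fd, write_fd = os.pipe()
    section = threading.Lock()  # stands in for the critical section
    owner_inside = threading.Event()

    def owner():
        with section:
            owner_inside.set()
            os.read(read_fd, 1)  # blocks while still holding the lock

    thread = threading.Thread(target=owner, daemon=True)
    thread.start()
    owner_inside.wait()

    # The waiter cannot enter the section while the owner is blocked inside.
    stuck = not section.acquire(timeout=timeout)
    if not stuck:
        section.release()

    os.write(write_fd, b"x")  # unblock the owner's read
    thread.join()

    # Once the owner has left the section, the waiter can enter it.
    free_afterwards = section.acquire(timeout=timeout)
    if free_afterwards:
        section.release()

    os.close(read_fd)
    os.close(write_fd)
    return stuck, free_afterwards
```

In this model the waiter makes progress only after the owner's blocking
call returns, which is why the state of the owning thread is the more
interesting thing to inspect.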

I'll try to poke around the hanging conhost.exe some more, and may also
try to create a faster reproducer.

Thanks for your help so far,
Alexey

