Date: Mon, 11 Apr 2022 17:35:26 +0900
From: Takashi Yano <takashi DOT yano AT nifty DOT ne DOT jp>
To: cygwin AT cygwin DOT com
Subject: Re: Deadlock of the process tree when running make
Message-Id: <20220411173526.6243b9492e0fc3d4132a58a8@nifty.ne.jp>
In-Reply-To: <b937a782f8b8993e3d4a058a354596a7@ispras.ru>
References: <9388316255ada0e0fcb2d849cce5a894 AT ispras DOT ru>
 <20220409191743 DOT 6da2268a36e8c9b4ab22c722 AT nifty DOT ne DOT jp>
 <1ecd670b1cdff43e0b0d7e5ee4c9cfc5 AT ispras DOT ru>
 <ab3971adb8f441fd16bb62e480547a95 AT ispras DOT ru>
 <20220409204619 DOT dd0e53902d5e108ef462e510 AT nifty DOT ne DOT jp>
 <907ce1b4416a826cb07990dd601bd687 AT ispras DOT ru>
 <20220410015753 DOT 753e2a238513eaf2a3da81e9 AT nifty DOT ne DOT jp>
 <f55466cdda02fa46bc43174ba412df3a AT ispras DOT ru>
 <20220410025410 DOT 196aa0a04368147dbbb31d3e AT nifty DOT ne DOT jp>
 <afad32070411d6d94d5d94da90478af4 AT ispras DOT ru>
 <7204ed0aa2d6b3fcfb239010e6b67646 AT ispras DOT ru>
 <20220410163432 DOT 00dd7b9f81f8f322d97688f2 AT nifty DOT ne DOT jp>
 <0e1a53626639cb21369225ff9092ecfc AT ispras DOT ru>
 <b937a782f8b8993e3d4a058a354596a7 AT ispras DOT ru>
Cc: Alexey Izbyshev <izbyshev AT ispras DOT ru>

On Sun, 10 Apr 2022 23:49:29 +0300
Alexey Izbyshev wrote:
> On 2022-04-10 15:13, Alexey Izbyshev wrote:
> > On 2022-04-10 10:34, Takashi Yano wrote:
> >> On Sat, 09 Apr 2022 23:26:51 +0300
> >> Thanks for investigating. In the normal case, conhost.exe is
> >> terminated when hWritePipe is closed.
> >
> > Thanks for confirming.
> >
> >>
> >> Possibly, the hWritePipe has incorrect handle value.
> >
> > I've verified that the handle was correct by attaching via gdb to the
> > hanging bash and checking that hWritePipe field is now zeroed (which
> > happens only in the branch where _HandleIsValid returns true and
> > hWritePipe is closed).
> >
> > I've found something interesting though. I've modeled a similar
> > situation on another machine:
> >
> > 1. I've run a native process via bash.
> > 2. I've attached to bash via gdb and set a breakpoint on
> > ClosePseudoConsole().
> > 3. I've killed the native process.
> > 4. The breakpoint was hit, and I looked at hWritePipe value.
> >
> > ProcessHacker shows it as "Unnamed file: \FileSystem\Npfs". Both bash
> > and conhost had a single handle with such name, and after I've
> > forcibly closed it in the bash process (while it was still suspended
> > by gdb), conhost.exe indeed died.
> >
> > Then I looked at the original hanging tree and found that the hanging
> > bash.exe still has a single handle displayed as "Unnamed file:
> > \FileSystem\Npfs". I don't know how to check what kernel object it
> > refers to, but at least its access rights are the same as for
> > hWritePipe that I've seen on another machine, and its handle count is
> > 1. So could it be another copy of hWritePipe, e.g. due to some handle
> > leak?
> >
> > I don't know how to verify whether this suspicious handle in bash.exe
> > is paired with "Unnamed file: \FileSystem\Npfs" in conhost.exe, other
> > than by forcibly closing it. If I close it and conhost.exe dies, it
> > will confirm "the extra handle" theory, but will also prevent further
> > investigation with the hanging tree. Do you have any advice?
> >
> I've found something that looked strange to me by checking handles in
> the hanging process tree: the hanging conhost.exe and the hanging
> bash.exe belong to different tests. Each test is a separate shell script
> in a separate make recipe, so it looks like conhost.exe was created by
> one test (which is still hanging at a later point in its script, trying
> to run grep), but then bash.exe belonging to another test somehow got a
> pseudoconsole referring to this conhost.exe and now hangs trying to
> close it. So it looks like Cygwin migrated the pseudoconsole between
> processes, and indeed fhandler_pty_slave::close_pseudoconsole() contains
> something looking like migration logic. And this logic contains the
> following call:
>
>     DuplicateHandle (GetCurrentProcess (),
>                      ttyp->h_pcon_write_pipe,
>                      new_owner, &new_write_pipe,
>                      0, TRUE, DUPLICATE_SAME_ACCESS);
>
> Is it safe to create an *inheritable* handle in another process here?
> Could it be that the target process spawns a child at the wrong moment
> (e.g. before it even knows about the newly created handle), and that
> handle unintentionally leaks into the child, triggering the hang
> afterwards?

Thanks for finding that! As you pointed out, hWritePipe should
not be inheritable. That might be the cause.

A countermeasure version is available at the following locations:
https://tyan0.yr32.net/cygwin/x86/test/cygwin1-20220411.dll.xz
https://tyan0.yr32.net/cygwin/x86_64/test/cygwin1-20220411.dll.xz

Could you please test it?

To keep the hanging tree, please install Cygwin in another directory
and replace its cygwin1.dll with the countermeasure version.

If you want to set up another sshd, please use a command such as:
    ssh-host-config --name cygsshd2 --port 2222

To remove the sshd installed with the above command:
    cygrunsrv -E cygsshd2
    cygrunsrv -R cygsshd2

-- 
Takashi Yano <takashi DOT yano AT nifty DOT ne DOT jp>

--
Problem reports:      https://cygwin.com/problems.html
FAQ:                  https://cygwin.com/faq/
Documentation:        https://cygwin.com/docs.html
Unsubscribe info:     https://cygwin.com/ml/#unsubscribe-simple
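
[Editor's illustration, not part of the original message]
The "extra handle" theory in the message above rests on a general property of pipes: the reading side only notices that the pipe was closed once every copy of the write end is gone. The sketch below shows that property with POSIX pipes as an analogy only. It is not Cygwin's code and does not use the Windows handle API; the function name reader_sees_eof and its leak_duplicate parameter are hypothetical, and dup() merely stands in for an unintended extra copy of the write end (such as one inherited by a child process).

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper (POSIX analogy, not Cygwin code).
 * Creates a pipe, optionally keeps a duplicate of the write end
 * (standing in for a leaked, inherited handle), then closes the
 * "owner" write end and probes the read end without blocking.
 *
 * Returns  1 if the reader sees end-of-file,
 *          0 if the pipe is still held open by the duplicate,
 *         -1 on an unexpected error.
 */
int reader_sees_eof(int leak_duplicate)
{
    int fds[2];
    int leaked = -1;
    int result;
    char c;
    ssize_t n;

    if (pipe(fds) != 0)
        return -1;

    if (leak_duplicate) {
        /* The extra copy of the write end that nobody remembers to close. */
        leaked = dup(fds[1]);
        if (leaked < 0) {
            close(fds[0]);
            close(fds[1]);
            return -1;
        }
    }

    /* The owner closes the only write end it knows about. */
    close(fds[1]);

    /* Probe the read end without blocking. */
    int flags = fcntl(fds[0], F_GETFL);
    if (flags < 0 || fcntl(fds[0], F_SETFL, flags | O_NONBLOCK) < 0) {
        result = -1;
    } else {
        n = read(fds[0], &c, 1);
        if (n == 0)
            result = 1;   /* all writers gone: end-of-file */
        else if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
            result = 0;   /* a writer still exists: no end-of-file */
        else
            result = -1;
    }

    close(fds[0]);
    if (leaked >= 0)
        close(leaked);
    return result;
}
```

Calling reader_sees_eof(0) returns 1, while reader_sees_eof(1) returns 0: with the duplicate still open, the reader never learns that the owner closed its end. That matches the shape of the hang described in the thread, under the assumption that a leaked copy of the write handle exists somewhere in the process tree.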