Mail Archives: cygwin/2022/04/10/16:50:00
Date: Sun, 10 Apr 2022 23:49:29 +0300
From: Alexey Izbyshev <izbyshev AT ispras DOT ru>
To: Takashi Yano <takashi DOT yano AT nifty DOT ne DOT jp>
Subject: Re: Deadlock of the process tree when running make
Cc: cygwin AT cygwin DOT com

On 2022-04-10 15:13, Alexey Izbyshev wrote:
> On 2022-04-10 10:34, Takashi Yano wrote:
>> On Sat, 09 Apr 2022 23:26:51 +0300
>> Thanks for investigating. In the normal case, conhost.exe is terminated
>> when hWritePipe is closed.
>
> Thanks for confirming.
>
>>
>> Possibly, hWritePipe has an incorrect handle value.
>
> I've verified that the handle was correct by attaching via gdb to the
> hanging bash and checking that hWritePipe field is now zeroed (which
> happens only in the branch where _HandleIsValid returns true and
> hWritePipe is closed).
>
> I've found something interesting though. I've modeled a similar
> situation on another machine:
>
> 1. I've run a native process via bash.
> 2. I've attached to bash via gdb and set a breakpoint on
> ClosePseudoConsole().
> 3. I've killed the native process.
> 4. The breakpoint was hit, and I looked at hWritePipe value.
>
> ProcessHacker shows it as "Unnamed file: \FileSystem\Npfs". Both bash
> and conhost had a single handle with such name, and after I've
> forcibly closed it in the bash process (while it was still suspended
> by gdb), conhost.exe indeed died.
>
> Then I looked at the original hanging tree and found that the hanging
> bash.exe still has a single handle displayed as "Unnamed file:
> \FileSystem\Npfs". I don't know how to check what kernel object it
> refers to, but at least its access rights are the same as for
> hWritePipe that I've seen on another machine, and its handle count is
> 1. So could it be another copy of hWritePipe, e.g. due to some handle
> leak?
>
> I don't know how to verify whether this suspicious handle in bash.exe
> is paired with "Unnamed file: \FileSystem\Npfs" in conhost.exe, other
> than by forcibly closing it. If I close it and conhost.exe dies, it
> will confirm "the extra handle" theory, but will also prevent further
> investigation with the hanging tree. Do you have any advice?
>
By checking handles in the hanging process tree, I've found something
that looks strange to me: the hanging conhost.exe and the hanging
bash.exe belong to different tests. Each test is a separate shell script
in a separate make recipe, so it looks like conhost.exe was created by
one test (which is still hanging at a later point in its script, trying
to run grep), but then a bash.exe belonging to another test somehow got a
pseudoconsole referring to this conhost.exe and now hangs trying to
close it.

So it looks like Cygwin migrated the pseudoconsole between processes,
and indeed fhandler_pty_slave::close_pseudoconsole() contains something
that looks like migration logic. This logic contains the following call:

    DuplicateHandle (GetCurrentProcess (),
                     ttyp->h_pcon_write_pipe,
                     new_owner, &new_write_pipe,
                     0, TRUE, DUPLICATE_SAME_ACCESS);

Is it safe to create an *inheritable* handle in another process here?
Could it be that the target process spawns a child at the wrong moment
(e.g. before it even knows about the newly created handle), and that
handle unintentionally leaks into the child, triggering the hang
afterwards?
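
As a rough illustration of why a stray copy of a pipe's write end
matters, here is a POSIX analogy in C (not the Cygwin or Win32 code
itself; the function names are hypothetical). It assumes that a pipe
reader only sees hang-up once every copy of the write end is closed,
which is the behaviour the conhost.exe termination described above
appears to rely on. dup() stands in for the leaked handle.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Returns 1 if the read end reports hang-up (no write ends left),
   0 if at least one write end is still open, -1 on error. */
int reader_sees_eof(int read_fd)
{
    struct pollfd p = { .fd = read_fd, .events = POLLIN };
    int n = poll(&p, 1, 0);              /* zero timeout: do not block */
    if (n < 0)
        return -1;
    return (n == 1 && (p.revents & POLLHUP)) ? 1 : 0;
}

/* Hypothetical demo: returns 1 if a stray duplicate of the write end
   delayed hang-up until it was closed as well, 0 if not, -1 on error. */
int leaked_writer_delays_eof(void)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    int leaked = dup(fds[1]);            /* stands in for the leaked handle */
    if (leaked < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    close(fds[1]);                       /* the "owner" closes its copy */
    int before = reader_sees_eof(fds[0]);

    close(leaked);                       /* only now is the last writer gone */
    int after = reader_sees_eof(fds[0]);

    close(fds[0]);
    return before == 0 && after == 1;
}
```

Calling leaked_writer_delays_eof() should return 1 on a POSIX system:
the reader does not see hang-up until the stray duplicate is closed too.
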

Similarly suspicious code is also in
fhandler_pty_common::resize_pseudo_console():

    DuplicateHandle (pcon_owner, get_ttyp ()->h_pcon_write_pipe,
                     GetCurrentProcess (), &hpcon_local.hWritePipe,
                     0, TRUE, DUPLICATE_SAME_ACCESS);
    ResizePseudoConsole ((HPCON) &hpcon_local, size);
    CloseHandle (pcon_owner);
    CloseHandle (hpcon_local.hWritePipe);

If another thread spawns a child using
CreateProcess(bInheritHandles=TRUE) between DuplicateHandle() and
CloseHandle(hpcon_local.hWritePipe), the handle will leak into the
child.
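
This kind of window has a direct POSIX counterpart: duplicating a
descriptor and only afterwards marking it non-inheritable. A minimal
sketch in C, as an analogy with hypothetical function names rather than
the Cygwin code: dup_then_mark() leaves a gap in which another thread
could fork and exec, while dup_cloexec() creates the copy already marked
close-on-exec. On the Win32 side, the analogous choice would presumably
be passing FALSE as the bInheritHandle argument of DuplicateHandle() when
the copy is only needed locally; whether that is acceptable in the
pseudoconsole code is an assumption not verified here.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Returns 1 if fd is marked close-on-exec, 0 if not, -1 on error. */
int is_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);
    if (flags < 0)
        return -1;
    return (flags & FD_CLOEXEC) ? 1 : 0;
}

/* Racy variant: the copy is inheritable between dup() and F_SETFD. */
int dup_then_mark(int fd)
{
    int copy = dup(fd);
    if (copy < 0)
        return -1;
    /* A concurrent fork+exec in another thread at this point
       would pass the copy on to the new program. */
    if (fcntl(copy, F_SETFD, FD_CLOEXEC) < 0) {
        close(copy);
        return -1;
    }
    return copy;
}

/* Atomic variant: the copy is never inheritable across exec. */
int dup_cloexec(int fd)
{
    return fcntl(fd, F_DUPFD_CLOEXEC, 0);
}
```

Both helpers end with a close-on-exec copy; the difference is only
whether an inheritable copy ever exists in between.
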

Sorry if this is a false lead; I haven't tried to really understand the
pseudoconsole-related code yet.

Thanks,
Alexey