Date: Fri, 26 Jul 2024 12:56:44 -0700 (PDT)
To: cygwin@cygwin.com
cc: Johannes.Schindelin@gmx.de
Subject: Re: double-fork issue on Windows on ARM64
In-Reply-To: <0b48aa79-8579-882e-a40b-724267936788@jdrake.com>
Message-ID: <a3ec694e-e9d5-04db-fa21-b93b28f7a0d4@jdrake.com>
References: <78f294de-4c94-242a-722e-fd98e51edff9@jdrake.com>
 <23f23b0a-e60e-e3ff-4c1e-295599fdc813@jdrake.com>
 <0b48aa79-8579-882e-a40b-724267936788@jdrake.com>
From: Jeremy Drake via Cygwin <cygwin@cygwin.com>
Reply-To: Jeremy Drake <cygwin@jdrake.com>

Replying on the cygwin@cygwin.com list, since it was determined that this
thread was incorrectly sent to cygwin-developers.  I apologize for the
inconvenience.

On Tue, 21 May 2024, Jeremy Drake wrote:

> On Mon, 20 May 2024, Jeremy Drake wrote:
>
> > Today, I was attempting to look at the TerminateThread situation.  The
> > call in question comes from the attempt to terminate the wait_thread of a
> > chld_procs entry.  I noticed elsewhere in cygwin code (flock.cc) that
> > CancelSynchronousIo was being called, and that stood out to me because
> > chances are that the wait thread (if running) is going to be blocked in
> > ReadFile.  I am testing with the following hack, and so far have not seen
> > a hang
>
>
> I left my reproducer running with this hack, and I did eventually get an
> error exit from the intermediate subprocess, which seems to have been a
> signal 11 (if I'm reading the status from waitpid correctly).
>
> What I noticed today is that in pinfo.cc, near the end of proc_waiter, it
> sets vchild.wait_thread = NULL;.  If my reading of this is correct, that
> does nothing useful, because vchild is a stack variable there and the
> function returns soon after.  I think what that was *intended* to do was to
> NULL out the wait_thread pointer that would be checked in proc_terminate,
> but there's no guarantee that the entry in chld_procs is in the same place
> at the end of proc_waiter as it was at the start (so arg may point to
> some other pinfo entirely).
>
> Does any of this make any sense, or am I barking up the wrong tree here?
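
To make the stack-copy concern above concrete, here is a minimal standalone
sketch.  It is not Cygwin code: fake_thread, fake_pinfo and both functions
are hypothetical stand-ins, and a std::vector stands in for chld_procs.  It
only shows that clearing a field on a by-value copy leaves the table entry
untouched.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for illustration; not the real Cygwin types.
struct fake_thread {};
struct fake_pinfo {
    int pid;
    fake_thread *wait_thread;
};

// Mimics the shape of the end of proc_waiter as described above:
// the function works on a by-value copy of the entry.
void waiter_clears_local_copy(fake_pinfo *arg) {
    fake_pinfo vchild = *arg;      // stack copy of the table entry
    vchild.wait_thread = nullptr;  // modifies only the copy
}                                  // vchild goes away here

// Returns true if the table entry still holds the thread pointer
// after the "waiter" has cleared its own copy.
bool table_entry_still_set() {
    fake_thread t;
    std::vector<fake_pinfo> chld_procs = {{100, &t}};
    waiter_clears_local_copy(&chld_procs[0]);
    return chld_procs[0].wait_thread == &t;
}
```

In this sketch table_entry_still_set() returns true, which is the situation
a later check of wait_thread in the table would run into.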

I tried adding a new "procstuff" enum member and case to proc_subproc to
loop through all chld_procs and NULL out wait_threads that equal the
pointer passed in as "val".  This was called from the proc_waiter before
it NULLed out its local copy of vchild.  This did not help.
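
The idea was roughly as follows.  This is a simplified sketch, not the
actual patch: thread_stub, child_entry and clear_wait_thread are
hypothetical names, and a std::vector stands in for chld_procs.  The point
is to match entries by thread pointer rather than by position, so it does
not matter if an entry has moved within the table.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for illustration; not the real Cygwin types.
struct thread_stub {};
struct child_entry {
    int pid;
    thread_stub *wait_thread;
};

// Clear wait_thread on every entry that refers to val.
// Returns the number of entries that were cleared.
std::size_t clear_wait_thread(std::vector<child_entry> &chld_procs,
                              const thread_stub *val) {
    std::size_t cleared = 0;
    for (child_entry &p : chld_procs) {
        if (p.wait_thread == val) {
            p.wait_thread = nullptr;
            ++cleared;
        }
    }
    return cleared;
}

// Simulates an entry moving within the table before the waiter finishes,
// then checks that the pointer-based lookup still finds and clears it.
bool clear_survives_reordering() {
    thread_stub a, b;
    std::vector<child_entry> chld_procs = {{100, &a}, {200, &b}};
    chld_procs.erase(chld_procs.begin());  // pid 200 moves to index 0
    std::size_t n = clear_wait_thread(chld_procs, &b);
    return n == 1 && chld_procs[0].pid == 200 &&
           chld_procs[0].wait_thread == nullptr;
}
```

The sketch leaves out any locking, which the real table would presumably
need around the loop.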

I am still fairly sure that somehow this terminate_thread call is where
the hang is happening, but I'm not seeing why.  This is definitely not
helped by the fact that debuggers cannot attach to the main thread for
some reason.

