Date: Fri, 26 Jul 2024 12:56:44 -0700 (PDT)
To: cygwin AT cygwin DOT com
cc: Johannes DOT Schindelin AT gmx DOT de
Subject: Re: double-fork issue on Windows on ARM64
In-Reply-To: <0b48aa79-8579-882e-a40b-724267936788@jdrake.com>
From: Jeremy Drake via Cygwin
Reply-To: Jeremy Drake

Replying to cygwin AT cygwin DOT com list due to determination that this
thread was incorrectly sent to cygwin-developers. I apologize for the
inconvenience.

On Tue, 21 May 2024, Jeremy Drake wrote:

> On Mon, 20 May 2024, Jeremy Drake wrote:
>
> > Today, I was attempting to look at the TerminateThread situation. The
> > call in question comes from the attempt to terminate the wait_thread of a
> > chld_procs entry. I noticed elsewhere in cygwin code (flock.cc) that
> > CancelSynchronousIo was being called, and that stood out to me because
> > chances are that the wait thread (if running) is going to be blocked in
> > ReadFile. I am testing with the following hack, and so far have not seen
> > a hang
>
> I left my reproducer running with this hack, and I did eventually get an
> error exit from the intermediate subprocess, which seems to have been a
> signal 11 (if I'm reading the status from waitpid correctly).
>
> What I noticed today is that in pinfo.cc, near the end of proc_waiter, it
> sets vchild.wait_thread = NULL;.
> If my reading of this is correct, that
> does nothing useful, because vchild is a stack variable there and the
> function returns soon after. I think what that was *intended* to do was to
> NULL out the wait_thread pointer that would be checked in proc_terminate,
> but there's no guarantee that the entry in chld_procs is in the same place
> at the end of proc_waiter as it was at the start (so arg may point to
> some other pinfo entirely).
>
> Does any of this make any sense, or am I barking up the wrong tree here?

I tried adding a new "procstuff" enum member and case to proc_subproc to
loop through all chld_procs and NULL out wait_threads that equal the
pointer passed in as "val". This was called from proc_waiter before it
NULLed out its local copy of vchild. This did not help.

I am still fairly sure that somehow this terminate_thread call is where
the hang is happening, but I'm not seeing why. This is definitely not
helped by the fact that debuggers cannot attach to the main thread for
some reason.
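
To make the reasoning above concrete, here is a minimal sketch of the two hazards described (clearing a field on a stack copy, and a saved slot pointer that later refers to a different child), plus the scan-and-clear approach that was tried. Everything in it is a hypothetical stand-in: ChildEntry, ChildTable, clear_local_copy and clear_matching are invented for illustration and are not Cygwin's pinfo, chld_procs or proc_subproc code, and the swap-with-last removal is only an assumption about how a table might be compacted.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a child-process table entry (not Cygwin's pinfo).
struct ChildEntry {
    int pid;
    void *wait_thread;  // stands in for the waiter thread handle
};

// Hypothetical fixed-size table; removal compacts by moving the last
// entry into the freed slot (an assumption made for this sketch).
struct ChildTable {
    std::array<ChildEntry, 8> slots{};
    std::size_t count = 0;

    void add(ChildEntry e) {
        assert(count < slots.size());
        slots[count++] = e;
    }

    void remove(std::size_t i) {
        assert(i < count);
        slots[i] = slots[--count];  // slot i now holds a different child
    }
};

// Hazard 1: the entry is taken by value, so the assignment only changes
// the local copy and is lost when the function returns.
inline void clear_local_copy(ChildEntry entry) {
    entry.wait_thread = nullptr;
}

// The approach that was tried: walk the live entries and clear every
// wait_thread equal to the given handle. Returns how many were cleared.
inline std::size_t clear_matching(ChildTable &table, const void *thread) {
    std::size_t cleared = 0;
    for (std::size_t i = 0; i < table.count; ++i) {
        if (table.slots[i].wait_thread == thread) {
            table.slots[i].wait_thread = nullptr;
            ++cleared;
        }
    }
    return cleared;
}
```

In this sketch, a pointer saved to slot 0 before a remove(0) call still points at valid storage afterwards, but that storage now describes whichever child was moved into the slot. Looking the entry up by thread handle at the moment of clearing avoids relying on the saved slot position.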