Date: Fri, 19 Sep 2025 13:36:52 -0700 (PDT)
To: cygwin AT cygwin DOT com
Subject: debugging hangs in high -j rust builds
Message-ID: <18aae124-c0b4-5ec1-a0ca-eaa3d7121d01@jdrake.com>
List-Id: General Cygwin discussions and problem reports
From: Jeremy Drake via Cygwin
Reply-To: Jeremy Drake

I'm trying to debug hangs during Rust builds with high -j values.

1. There's a hung "rustc.exe", which I cannot attach gdb to with gdb --pid. I attached with windbg, and it timed out and used some alternate means to attach because the loader lock was held. Is there some way to get gdb to attach in that way?

What windbg showed me is that this rustc is stuck in cygheap_fixup_in_child, in an infinite loop walking through the cygheap->chain. It appears to be stuck on a value 0x0000000800043710 whose prev is 0x0000000800043920 whose prev is 0x0000000800043710 ...

There are invisible (to ps) cargo.exe processes. One of them shows up in pstree as

?(1)─┬─?(2287)───rustc

Attaching to 2287 with gdb shows a fairly normal cargo.exe with a bunch of threads, but ultimately waiting on a child.

The other cargo.exe I was able to attach to with gdb --pid, but giving it the winpid (didn't know I could do that but gave it a try and it worked!)
Here's where that one is:

(gdb) bt
#0  0x00007ffa705917a4 in ntdll!ZwWaitForMultipleObjects () from /cygdrive/c/Windows/SYSTEM32/ntdll.dll
#1  0x00007ffa6dc36329 in WaitForMultipleObjectsEx () from /cygdrive/c/Windows/System32/KERNELBASE.dll
#2  0x00007ffa6dc3622e in WaitForMultipleObjects () from /cygdrive/c/Windows/System32/KERNELBASE.dll
#3  0x00007ff9ae662a46 in child_info::sync (this=0x7ff9ae84b940 , this@entry=0x0, pid=8220, hProcess=@0x7ffa0ac40: 0x63a8, howlong=howlong@entry=4294967295) at ../../../../winsup/cygwin/sigproc.cc:1138
#4  0x00007ff9ae666e51 in child_info_spawn::worker (this=, this@entry=0x7ff9ae84b940 , mode=3, mode@entry=4099, prog_arg=prog_arg@entry=0x80003fac0 "/home/user/rust/rust-1.90.0-1.x86_64/build/build-Cygwin/bootstrap/debug/rustc", args=...) at ../../../../winsup/cygwin/spawn.cc:924
#5  0x00007ff9ae66851a in spawnve (mode=4099, path=0x80003fac0 "/home/user/rust/rust-1.90.0-1.x86_64/build/build-Cygwin/bootstrap/debug/rustc", argv=0xa0bb2b100, envp=) at ../../../../winsup/cygwin/spawn.cc:1033
#6  0x00007ff9ae603be0 in execvp (file=, argv=0xa0bb2b100) at ../../../../winsup/cygwin/exec.cc:94
#7  0x00007ff9ae73cc64 in _sigfe () at sigfe.s:35
#8  0x000000010152c72c in std::sys::process::unix::unix::::do_exec ()
#9  0x000000010152c2e8 in std::sys::process::unix::unix::::spawn ()
#10 0x0000000100f63e29 in ::exec_with_streaming ()
#11 0x000000010078479c in ::exec ()
#12 0x00000001007691a7 in cargo::core::compiler::rustc::{closure#3} ()
#13 0x0000000100aa5db5 in <::then::{closure#0} as core::ops::function::FnOnce<(&cargo::core::compiler::job_queue::job_state::JobState,)>>::call_once::{shim:vtable#0} ()
#14 0x0000000100aa5db5 in <::then::{closure#0} as core::ops::function::FnOnce<(&cargo::core::compiler::job_queue::job_state::JobState,)>>::call_once::{shim:vtable#0} ()
#15 0x0000000100ab9791 in ::run_to_finish ()
#16 0x00000001009f5dd3 in std::sys::backtrace::__rust_begin_short_backtrace::<::run::{closure#1}, ()> ()
#17 0x00000001009fd80b in <::spawn_unchecked_<::run::{closure#1}, ()>::{closure#1} as core::ops::function::FnOnce<()>>::call_once::{shim:vt
--Type <RET> for more, q to quit, c to continue without paging--q
Quit
(gdb) define plist
Type commands for definition of "plist".
End with a line saying just "end".
>set var $n = $arg0
>while $n != 0x0
>printf "%p\n", $n
>set var $n = $n->prev
>end
>end
(gdb) plist cygheap->chain
0x800045370
0x800045330
0x800045260
0x8000451f0
0x800045180
0x800045140
0x8000450b0
0x800045080
0x800045030
0x800044fe0
0x800044fa0
0x800044f60
0x800044f20
0x800044ed0
0x800044e60
0x800044e10
0x800044dd0
0x800044d90
0x800044d50
0x800044d20
0x800044cf0
0x800044cc0
0x800044c50
0x800044ac0
0x800044a50
0x8000448c0
0x800044830
0x8000447a0
0x800044710
0x800044680
0x800044570
0x800044260
0x8000441f0
0x800044180
0x800044110
0x800044040
0x800043fd0
0x800043dc0
0x800043cb0
0x800043c40
0x800043bd0
0x800043b00
0x800043a30
0x800043920
0x800043710
0x800043920
0x800043710
0x800043920
...

I was thinking maybe the cygheap needed to be locked during spawn (I was looking at 3.6 code), but that locking had already been added in main, and I was running a 3.7.0 snapshot that contained that fix. Maybe it needs to be locked during fork also? I don't know how else the chain would be circular like that...

The chain in the waiting parent is fine and contains 43710 and 43920:

...
0x800043b30
0x800043920
0x800043710
0x800043500
0x8000432f0
...

fork.cc does contain a refresh_cygheap in frok::parent
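
The circular chain described above can be illustrated with a small standalone model. The following is a minimal sketch in Python, not Cygwin code: the function name find_cycle, the dictionary used to represent the prev links, and the visited-set approach are all assumptions made for illustration. It takes the last few addresses from the plist output and reports where the walk re-enters a node it has already visited, which is the condition that makes an unguarded walk (such as the plist macro) loop forever.

```python
# Standalone model of a prev-linked chain walk. NOT Cygwin code: the names
# and the dict-based representation are hypothetical, for illustration only.

def find_cycle(head, prev_of):
    """Walk a prev-linked chain starting at `head`.

    `prev_of` maps a node address to its predecessor's address; 0 ends
    the chain. Returns (order, entry): `order` lists the addresses in
    walk order, and `entry` is the first address reached a second time,
    or None if the walk terminates at 0.
    """
    seen = set()
    order = []
    node = head
    while node != 0:
        if node in seen:
            # Revisiting a node means the chain loops back on itself.
            return order, node
        seen.add(node)
        order.append(node)
        node = prev_of[node]
    return order, None


# Tail of the plist output from the post: 0x800043710's prev points back
# at 0x800043920 instead of at an older node.
child_tail = {
    0x800043a30: 0x800043920,
    0x800043920: 0x800043710,
    0x800043710: 0x800043920,
}

order, entry = find_cycle(0x800043a30, child_tail)
print(f"cycle re-enters at {hex(entry)} after {len(order)} nodes")
# prints: cycle re-enters at 0x800043920 after 3 nodes
```

A similar guard, for example a counter that stops the loop after a fixed number of nodes, could presumably be added to the plist macro so that it terminates on a corrupted chain.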