Date: Wed, 5 Nov 2025 21:46:30 +0900
From: Takashi Yano via Cygwin
To: cygwin AT cygwin DOT com
Subject: Re: Deadlock in cygwin1.dll when joining a thread before thread_local initialization

On Thu, 30 Oct 2025 11:47:05 +0000
Tomohiro Kashiwada wrote:
> Hello,
>
> I found a deadlock in cygwin1.dll that occurs during process cleanup.
>
> It happens when the following conditions are met:
> - A thread is launched that will initialize the thread_local slot for a
>   variable whose destructor is non-trivial.
> - The main thread waits to join that thread during global destructor calls,
>   before the thread has initialized the thread_local slot.
>
> The former condition refers to a function-scope static thread_local
> variable, or a global thread_local variable whose name has not yet been
> referenced by any thread.
>
> Here is a reproducer (the same file is attached), compile with g++ 13.4.0
> regardless of optimization, and run under cygwin 3.6.5-1
> ---------------------------------------------
>
> #include <thread>
>
> struct the_type {
>   ~the_type() {}
> };
> struct myjthread {
>   template <typename F>
>   myjthread(F f): thr(f) {}
>   ~myjthread() { thr.join(); }
>   std::thread thr;
> };
>
> thread_local the_type g_v;
>
>
> int main() {
>   // if main thread accesses the thread_local variable first, pattern2 doesn't matter
>   //g_v = {};
>   static myjthread t([] {
>     //std::this_thread::sleep_for(std::chrono::seconds(1)); //< this sleep might increase reproducibility
>
>     // pattern1: static thread_local
>     static thread_local the_type s_v;
>
>     // pattern2: global thread_local; its slot is allocated in this thread
>     //g_v = {};
>   });
> }
>
> ---------------------------------------------
>
> This issue was observed as a random hang in the LLVM test suite.
>
> Although the triggering thread_local variable in LLVM can be removed, I
> hope the runtime can be fixed.

Thanks for the report. I looked into the issue and found that the cause
is in newlib. I have confirmed that the following patch fixes the issue,
but I am not sure that unlocking the mutex here is safe. Let me think
about it a bit more.

diff --git a/newlib/libc/stdlib/__call_atexit.c b/newlib/libc/stdlib/__call_atexit.c
index 710440389..44f1f6acc 100644
--- a/newlib/libc/stdlib/__call_atexit.c
+++ b/newlib/libc/stdlib/__call_atexit.c
@@ -114,6 +114,11 @@ __call_exitprocs (int code, void *d)
 
           ind = p->_ind;
 
+#ifndef __SINGLE_THREAD__
+          /* Unlock __atexit_recursive_mutex; otherwise, the function fn() may
+             deadlock if it waits for another thread which calls atexit(). */
+          __lock_release_recursive(__atexit_recursive_mutex);
+#endif
           /* Call the function.  */
           if (!args || (args->_fntypes & i) == 0)
             fn ();
@@ -121,6 +126,9 @@ __call_exitprocs (int code, void *d)
             (*((void (*)(int, void *)) fn))(code, args->_fnargs[n]);
           else
             (*((void (*)(void *)) fn))(args->_fnargs[n]);
+#ifndef __SINGLE_THREAD__
+          __lock_acquire_recursive(__atexit_recursive_mutex);
+#endif
 
           /* The function we called call atexit and registered another
              function (or functions).  Call these new functions before

-- 
Takashi Yano
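
The locking pattern in the patch can be modelled outside newlib. The following is only a rough sketch in standard C++, not newlib code: g_registry_lock, g_handlers, register_handler(), run_handlers() and run_demo() are hypothetical names standing in for __atexit_recursive_mutex, the atexit list, atexit() and __call_exitprocs(). It assumes the lock is held exactly once when a handler is called, and it says nothing about whether releasing the lock is safe in newlib itself.

```cpp
#include <cassert>
#include <functional>
#include <future>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-ins for __atexit_recursive_mutex and the atexit list.
static std::recursive_mutex g_registry_lock;
static std::vector<std::function<void()>> g_handlers;
static int g_calls = 0;  // only touched by the thread that runs the handlers

// Models atexit(): takes the registry lock to append a handler.
void register_handler(std::function<void()> fn) {
    std::lock_guard<std::recursive_mutex> guard(g_registry_lock);
    g_handlers.push_back(std::move(fn));
}

// Models __call_exitprocs() with the patch applied: the lock is released
// around each handler call and re-acquired afterwards.
void run_handlers() {
    std::unique_lock<std::recursive_mutex> lock(g_registry_lock);
    while (!g_handlers.empty()) {
        std::function<void()> fn = std::move(g_handlers.back());
        g_handlers.pop_back();
        lock.unlock();  // without this, the join below would never return
        fn();
        ++g_calls;
        lock.lock();    // handlers registered meanwhile are picked up by the loop
    }
    assert(g_handlers.empty());
}

// Returns the number of handlers that were run.
int run_demo() {
    g_calls = 0;
    std::promise<void> handler_started;
    std::future<void> started = handler_started.get_future();

    // The worker registers a handler only after the joining handler has
    // started, like a thread_local destructor registered late.
    std::thread worker([&started] {
        started.wait();
        register_handler([] {});
    });

    // Like ~myjthread() running during global destruction: join the worker.
    register_handler([&handler_started, &worker] {
        handler_started.set_value();
        worker.join();
    });

    run_handlers();
    return g_calls;
}
```

Calling run_demo() from main() returns 2: the joining handler runs first, and the handler registered by the worker while it ran is called afterwards. If the unlock()/lock() pair is removed, the worker blocks in register_handler() while the main thread blocks in join(), which is the same shape as the reported hang.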