Mail Archives: cygwin/2009/11/19/06:30:36

Message-ID: <4B052C11.6010307@ece.cmu.edu>
Date: Thu, 19 Nov 2009 12:29:21 +0100
From: Ryan Johnson <ryanjohn AT ece DOT cmu DOT edu>
User-Agent: Thunderbird 2.0.0.23 (Windows/20090812)
MIME-Version: 1.0
To: cygwin AT cygwin DOT com
Subject: 1.5.25-15: pthread_join deadlocks
Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm
List-Id: <cygwin.cygwin.com>
List-Subscribe: <mailto:cygwin-subscribe AT cygwin DOT com>
List-Archive: <http://sourceware.org/ml/cygwin/>
List-Post: <mailto:cygwin AT cygwin DOT com>
List-Help: <mailto:cygwin-help AT cygwin DOT com>, <http://sourceware.org/ml/#faqs>
Sender: cygwin-owner AT cygwin DOT com
Mail-Followup-To: cygwin AT cygwin DOT com
Delivered-To: mailing list cygwin AT cygwin DOT com

Hi all,

I'm hitting a deadlock with cygwin pthreads when joining on a 
short-lived thread -- for me, the join on the second such thread almost 
never returns. It looks *exactly* like a problem that others noticed as 
far back as early 2005 [0], and from the strace output for the test 
case (below), the culprit is almost certainly a racy optimization in 
__cygwin_lock_*, for which a patch was submitted six months ago [1].

As of today my cygwin distribution is completely up to date. Any hope of 
an update coming out soon?

Regards,
Ryan

[0] http://coding.derkeiler.com/Archive/General/comp.programming/2005-02/0786.html
[1] http://www.mail-archive.com/cygwin-patches AT cygwin DOT com/msg04323.html

$ cat bug.cpp
#include <pthread.h>
#include <cassert>
#include <cstdio>
#define ANNOUNCE(what) fprintf(stderr, what "\n")
extern "C" void* run(void*) {
    ANNOUNCE("Running");
    return 0;
}
int main() {
    pthread_t tid;
    ANNOUNCE("Starting");
    for(int i=0; i < 10; i++) {
        ANNOUNCE("Creating thread");
        assert(0 == pthread_create(&tid, 0, &run, 0));
        ANNOUNCE("Joining thread");
        assert(0 == pthread_join(tid, 0));
    }
    ANNOUNCE("Done");
}

$ g++ -Wall -g -mthreads bug.cpp && strace --mask=all+thread+paranoid+debug+uhoh ./a.exe
**********************************************
Program name: C:\cygwin\home\Ryan\experiments\a.exe (pid 2860, ppid 1)
App version:  1005.25, api: 0.156
DLL version:  1005.25, api: 0.156
DLL build:    2008-06-12 19:34
OS version:   Windows NT-5.1
Heap size:    402653184
Date/Time:    2009-11-19 11:58:33
**********************************************
   48   32123 [main] a 2860 __cygwin_lock_lock: threadcount 1.  not locking
   24   32147 [main] a 2860 __cygwin_lock_unlock: threadcount 1.  not unlocking
   25   32172 [main] a 2860 __cygwin_lock_lock: threadcount 1.  not locking
Starting
   24   33868 [main] a 2860 __cygwin_lock_unlock: threadcount 1.  not unlocking
   23   33891 [main] a 2860 __cygwin_lock_lock: threadcount 1.  not locking
Creating thread
   23   34109 [main] a 2860 __cygwin_lock_unlock: threadcount 1.  not unlocking
  260   34369 [unknown (0x1C8C)] a 2860 pthread::thread_init_wrapper: started thread 0x100428B0 0xD8D008 0x61102D90 0x100428B0 0x401145 0x0
   33   34402 [unknown (0x1C8C)] a 2860 __cygwin_lock_lock: threadcount 2.  locking
   39   34561 [main] a 2860 __cygwin_lock_lock: threadcount 2.  locking
Running
   91   34860 [unknown (0x1C8C)] a 2860 __cygwin_lock_unlock: threadcount 2.  unlocked

***** Child thread exits here *****

Joining thread

***** Main thread decides it doesn't need to release the lock *****

   22   35166 [main] a 2860 __cygwin_lock_unlock: threadcount 1.  not unlocking
   56   35222 [main] a 2860 __cygwin_lock_lock: threadcount 1.  not locking
Creating thread
   24   36990 [main] a 2860 __cygwin_lock_unlock: threadcount 1.  not unlocking
  156   37146 [unknown (0x10B8)] a 2860 pthread::thread_init_wrapper: started thread 0x100428B0 0xD8D008 0x61102D90 0x100428B0 0x401145 0x0
   29   37175 [unknown (0x10B8)] a 2860 __cygwin_lock_lock: threadcount 2.  locking

***** Second child thread is now blocked on the lock which the main thread holds *****

   25   37200 [main] a 2860 __cygwin_lock_lock: threadcount 2.  locking
Joining thread

***** Apparently recursive lock acquires work? *****

   25   38604 [main] a 2860 __cygwin_lock_unlock: threadcount 2.  unlocked


***** Unfortunately, main still holds the lock and is now joining the 
child that is blocked on it *****
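
The sequence annotated in the trace can be replayed without real threads. What follows is only a rough model inferred from the strace messages ("threadcount 1.  not locking" / "not unlocking"), not the Cygwin source: ModelLock, replay_deadlocks, skip_when_single and the thread ids are all made-up names for illustration. With the skip enabled, main's unlock is dropped once the first child has exited, so the second child can never take the lock. With the skip disabled (shown purely for comparison, not as a description of what the patch in [1] does), the same sequence goes through.

```cpp
// Hypothetical model of a recursive lock whose lock/unlock calls are
// skipped while only one thread exists. Not taken from Cygwin.
struct ModelLock {
    bool skip_when_single;  // true models the optimization seen in the trace
    int owner;              // 0 means unowned, otherwise a thread id
    int depth;              // recursion depth held by the owner

    explicit ModelLock(bool skip)
        : skip_when_single(skip), owner(0), depth(0) {}

    // Returns false if the caller would have to block.
    bool lock(int tid, int threadcount) {
        if (skip_when_single && threadcount == 1) return true;  // "not locking"
        if (owner != 0 && owner != tid) return false;           // would block
        owner = tid;
        ++depth;
        return true;
    }

    void unlock(int tid, int threadcount) {
        if (skip_when_single && threadcount == 1) return;       // "not unlocking"
        if (owner == tid && --depth == 0) owner = 0;
    }
};

// Replays the order of events from the trace. Returns true if main ends up
// joining a child that is blocked on a lock main still owns.
bool replay_deadlocks(bool skip_when_single) {
    const int MAIN = 1, CHILD1 = 2, CHILD2 = 3;  // illustrative thread ids
    ModelLock l(skip_when_single);
    int threadcount = 1;

    l.lock(MAIN, threadcount);      // main alone: lock/unlock pair
    l.unlock(MAIN, threadcount);

    threadcount = 2;                // first child is created
    l.lock(CHILD1, threadcount);    // child takes the lock first
    l.unlock(CHILD1, threadcount);  // child releases it
    l.lock(MAIN, threadcount);      // main acquires with threadcount 2
    threadcount = 1;                // child exits before main unlocks
    l.unlock(MAIN, threadcount);    // dropped when the skip is enabled

    l.lock(MAIN, threadcount);      // main alone again
    l.unlock(MAIN, threadcount);

    threadcount = 2;                // second child is created
    if (l.lock(CHILD2, threadcount)) return false;  // child can proceed
    l.lock(MAIN, threadcount);      // recursive acquire by main
    l.unlock(MAIN, threadcount);    // depth drops to 1, lock still owned
    return l.owner == MAIN;         // main now joins the blocked child
}
```

Calling replay_deadlocks(true) follows the trace step by step and ends with main owning the lock the second child needs; replay_deadlocks(false) lets the second child take the lock.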


