Mail Archives: cygwin/2014/08/07/14:54:18

Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm
List-Id: <cygwin.cygwin.com>
List-Subscribe: <mailto:cygwin-subscribe AT cygwin DOT com>
List-Archive: <http://sourceware.org/ml/cygwin/>
List-Post: <mailto:cygwin AT cygwin DOT com>
List-Help: <mailto:cygwin-help AT cygwin DOT com>, <http://sourceware.org/ml/#faqs>
Sender: cygwin-owner AT cygwin DOT com
Mail-Followup-To: cygwin AT cygwin DOT com
Delivered-To: mailing list cygwin AT cygwin DOT com
Message-ID: <53E3CB3E.4010801@cornell.edu>
Date: Thu, 07 Aug 2014 14:53:50 -0400
From: Ken Brown <kbrown AT cornell DOT edu>
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0) Gecko/20100101 Thunderbird/24.6.0
MIME-Version: 1.0
To: cygwin AT cygwin DOT com
Subject: Re: (call-process ...) hangs in emacs
References: <20140801133225 DOT GD25860 AT calimero DOT vinschen DOT de> <53DEDBBA DOT 20102 AT cornell DOT edu> <20140804080034 DOT GA2578 AT calimero DOT vinschen DOT de> <53DF8BDC DOT 8090104 AT cornell DOT edu> <20140804134526 DOT GK2578 AT calimero DOT vinschen DOT de> <53E0CC2D DOT 4080305 AT cornell DOT edu> <20140805135830 DOT GA9994 AT calimero DOT vinschen DOT de> <53E11A93 DOT 9070800 AT cornell DOT edu> <20140805184047 DOT GC13601 AT calimero DOT vinschen DOT de> <53E3685B DOT 8050508 AT cornell DOT edu> <20140807125137 DOT GV13601 AT calimero DOT vinschen DOT de>
In-Reply-To: <20140807125137.GV13601@calimero.vinschen.de>
X-IsSubscribed: yes

On 8/7/2014 8:51 AM, Corinna Vinschen wrote:
> Hi Ken,
>
> On Aug  7 07:51, Ken Brown wrote:
>> Hi Corinna,
>>
>> On 8/5/2014 2:40 PM, Corinna Vinschen wrote:
>>> I'm glad to read that, but I'm still a little bit concerned.  If your
>>> code works with ERRORCHECK mutexes but hangs with NORMAL mutexes, you
>>> *might* miss an error case.
>>>
>>> I'd suggest to tweak the pthread_mutex_lock/unlock calls and log the
>>> threads calling it.  It looks like the same thread calls malloc from
>>> malloc for some reason and it might be interesting to learn how that
>>> happens and if it's really ok in this scenario, because it seems to
>>> be unexpected by the code.
>>
>> I think I found the problem with NORMAL mutexes.  emacs calls pthread_atfork
>> after initializing the mutexes, and the resulting 'prepare' handler locks
>> the mutexes.  (The parent and child handlers unlock them.)  So when emacs
>> calls fork, the mutexes are locked, and shortly thereafter the Cygwin DLL
>> calls calloc, leading to a deadlock. Here's a gdb backtrace showing the
>> sequence of calls:
>
> First question:  Why does emacs use its own malloc on Cygwin rather
> than the system-provided one?  Is that really necessary?

Cygwin's malloc lacks a few features that emacs requires because of the 
unusual way emacs is built.  The most important such features (or maybe 
even the only ones) are malloc_set_state and malloc_get_state.
>
>> #0  malloc (size=size@entry=40) at gmalloc.c:919
>> #1  0x0053fc28 in calloc (nmemb=1, size=40) at gmalloc.c:1510
>> #2  0x61082074 in calloc (nmemb=1, size=40)
>>      at /usr/src/debug/cygwin-1.7.31-3/winsup/cygwin/malloc_wrapper.cc:100
>> #3  0x61003177 in operator new (s=s@entry=40)
>>      at /usr/src/debug/cygwin-1.7.31-3/winsup/cygwin/cxx.cc:23
>> #4  0x610fc9d3 in pthread_mutex::init (mutex=0x61187d34 <reent_data+852>,
>>      attr=0x0, initializer=0x12)
>>      at /usr/src/debug/cygwin-1.7.31-3/winsup/cygwin/thread.cc:3118
>> #5  0x610fcc13 in pthread_mutex_lock (mutex=0x61187d34 <reent_data+852>)
>>      at /usr/src/debug/cygwin-1.7.31-3/winsup/cygwin/thread.cc:3170
>> #6  0x611319d8 in __fp_lock (ptr=0x61187cd0 <reent_data+752>)
>
> Right, __fp_lock needs a pthread lock and since this lock hasn't been
> used yet, it has to create it.  The pthread_mutex creation calls the
> new operator which in turn calls calloc.
>
>>      at /usr/src/debug/cygwin-1.7.31-3/newlib/libc/stdio/findfp.c:287
>> #7  0x61154f75 in _fwalk (ptr=0x28d544,
>>      function=function@entry=0x611319c0 <__fp_lock>)
>>      at /usr/src/debug/cygwin-1.7.31-3/newlib/libc/stdio/fwalk.c:50
>> #8  0x61131dea in __fp_lock_all ()
>>      at /usr/src/debug/cygwin-1.7.31-3/newlib/libc/stdio/findfp.c:307
>> #9  0x610fa45e in pthread::atforkprepare ()
>>      at /usr/src/debug/cygwin-1.7.31-3/winsup/cygwin/thread.cc:2031
>> #10 0x61076292 in lock_pthread (this=<synthetic pointer>)
>>      at /usr/src/debug/cygwin-1.7.31-3/winsup/cygwin/sigproc.h:137
>> #11 hold_everything (x=<synthetic pointer>, this=<synthetic pointer>)
>>      at /usr/src/debug/cygwin-1.7.31-3/winsup/cygwin/sigproc.h:169
>> #12 fork () at /usr/src/debug/cygwin-1.7.31-3/winsup/cygwin/fork.cc:582
>>
>> Is there a better way to deal with this issue than using ERRORCHECK mutexes?
>
> Did you check if you get an error from pthread_mutex_lock on the
> second invocation of malloc?  Is it EDEADLK?  If so, you can
> ignore the error, but if you want to go ahead without adding lots
> of error checking you might be better off using a RECURSIVE mutex.

I didn't check the error, but the code makes it fairly clear that this
is what happens.  Yes, using a RECURSIVE mutex sounds like a good idea.
Or maybe it would be just as good to remove the call to pthread_atfork.
See my reply to Eric later in the thread.
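
For reference, the difference between the mutex types discussed above can
be sketched in a few lines of C.  This is only an illustration under my own
assumptions, not the emacs code: relock_result is a made-up helper name, and
it imitates the "malloc called from malloc" situation by locking the same
mutex twice from one thread.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <pthread.h>

/* Hypothetical helper (not from emacs or Cygwin): lock a mutex of the
   given type twice from the same thread and return the result of the
   second pthread_mutex_lock call.
   Do not pass PTHREAD_MUTEX_NORMAL: the second lock would never return. */
int relock_result(int type)
{
    pthread_mutexattr_t attr;
    pthread_mutex_t mutex;
    int second;

    pthread_mutexattr_init(&attr);
    pthread_mutexattr_settype(&attr, type);
    pthread_mutex_init(&mutex, &attr);
    pthread_mutexattr_destroy(&attr);

    pthread_mutex_lock(&mutex);           /* outer lock, e.g. the atfork 'prepare' handler */
    second = pthread_mutex_lock(&mutex);  /* nested lock, e.g. malloc called during fork */

    if (second == 0)
        pthread_mutex_unlock(&mutex);     /* RECURSIVE case: undo the nested lock */
    pthread_mutex_unlock(&mutex);         /* release the outer lock */
    pthread_mutex_destroy(&mutex);
    return second;
}
```

Calling relock_result(PTHREAD_MUTEX_ERRORCHECK) should return EDEADLK, while
relock_result(PTHREAD_MUTEX_RECURSIVE) should return 0.  With a NORMAL mutex
the second lock would simply hang, which matches the deadlock in the
backtrace.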

Ken

--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
