From: Ken Brown
Date: Fri, 08 Aug 2014 09:26:51 -0400
To: cygwin AT cygwin DOT com
Subject: Re: (call-process ...) hangs in emacs
Message-ID: <53E4D01B.9010005@cornell.edu>
In-Reply-To: <53E3F2AE.7030608@redhat.com>

On 8/7/2014 5:42 PM, Eric Blake wrote:
> On 08/07/2014 12:53 PM, Ken Brown wrote:
>> On 8/7/2014 11:30 AM, Eric Blake wrote:
>>> On 08/07/2014 05:51 AM, Ken Brown wrote:
>>>>
>>>> I think I found the problem with NORMAL mutexes. emacs calls
>>>> pthread_atfork after initializing the mutexes, and the resulting
>>>> 'prepare' handler locks the mutexes. (The parent and child handlers
>>>> unlock them.) So when emacs calls fork, the mutexes are locked, and
>>>> shortly thereafter the Cygwin DLL calls calloc, leading to a deadlock.
>>>> Here's a gdb backtrace showing the sequence of calls:
>>>
>>> Arguably, that's an upstream bug in emacs. POSIX has declared
>>> pthread_atfork to be fundamentally useless; it is broken by design,
>>> because you cannot use it for anything that is not async-signal-safe
>>> without risking deadlock. And (except for sem_post()), NONE of the
>>> standardized locking functions are async-signal-safe.
>>>
>>> http://austingroupbugs.net/view.php?id=858
>>>
>>> That said, it would still be nice to support this, since even though the
>>> theory says it is broken, there are still lots of (broken)
>>> programs/libraries still trying to use it.
>>
>> So what do you think emacs should do instead of using pthread_atfork? Or
>> is it better to just remove it? I don't know how likely it is that this
>> would cause a problem.
>
> The POSIX recommendation is that multithreaded apps limit themselves
> solely to async-signal-safe functions in the window between fork and
> exec (or to use posix_spawn instead of fork/exec). I don't know what
> emacs is trying to do in that window, but at this point, it's certainly
> worth reporting it upstream. If you need a pointer to the full list of
> async-signal-safe functions:
>
> http://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_04
> and search for "The following table defines a set of functions that
> shall be async-signal-safe."
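
For illustration, the spawn-instead-of-fork/exec route recommended above
might look roughly like the following, using posix_spawn (the POSIX
interface for starting a child without running application code between
fork and exec). This is only a sketch: run_child is a hypothetical helper
name, and nothing here is taken from the emacs or Cygwin sources.

```c
#include <errno.h>
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/* Hypothetical helper (not from emacs or Cygwin): start the program at
 * `path` with arguments `argv`, wait for it, and return its exit status.
 * Returns -1 if the spawn or the wait fails, or if the child did not
 * exit normally. */
int run_child(const char *path, char *const argv[])
{
    pid_t pid;
    int status;

    /* posix_spawn reports failure through its return value, not errno.
     * No file actions or spawn attributes are needed for this sketch. */
    int err = posix_spawn(&pid, path, NULL, NULL, argv, environ);
    if (err != 0)
        return -1;

    /* Retry the wait if a signal interrupts it. */
    while (waitpid(pid, &status, 0) == -1) {
        if (errno != EINTR)
            return -1;
    }

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because no application code runs in the child before the new program
starts, there is no window in which the child could call malloc() or
take a lock that was held by a thread that no longer exists.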
>
> The most common deadlocks when violating async-signal-safety rules look
> like this in single-threaded programs:
>
> function calls malloc()
> malloc() grabs a non-recursive mutex
> async signal arrives
> signal handler called
> signal handler calls malloc()
> malloc() can't grab the mutex - deadlock
>
> and this counterpart in multithreaded programs:
>
> thread1 calls malloc()
> malloc() grabs a non-recursive mutex
> thread 2 gains control and calls fork()
> because of the fork, thread1 no longer exists to release the lock
> child process calls malloc()
> malloc() tries to grab mutex, but it is locked with no thread to
> release it
>
> Switching malloc() to a recursive lock may or may not "solve" the
> single-threaded deadlock (in that malloc can now obtain the mutex), but
> it is probably NOT what you want to happen (unless malloc is fully
> re-entrant, the inner instance will see incomplete data and either be
> totally clobbered itself, or else totally clobber the outer instance
> when it returns). So it's GOOD that malloc does NOT use a recursive
> mutex by default.
>
> In the multithreaded case, you are flat out hosed. Switching to a
> recursive lock does not change the picture - you are still deadlocked
> waiting on thread1 to release the lock, but thread1 doesn't exist.

Thanks for the explanations, Eric. I've filed an emacs bug report:

http://debbugs.gnu.org/cgi/bugreport.cgi?bug=18222

Ken

--
Problem reports: http://cygwin.com/problems.html
FAQ: http://cygwin.com/faq/
Documentation: http://cygwin.com/docs.html
Unsubscribe info: http://cygwin.com/ml/#unsubscribe-simple
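
The multithreaded scenario Eric describes can be reproduced in miniature
with an ordinary mutex standing in for malloc()'s internal lock. The
sketch below is an illustration only: holder and child_sees_orphaned_lock
are hypothetical names, unnamed POSIX semaphores are assumed to be
available, and the child uses pthread_mutex_trylock so that it reports
the stuck lock instead of hanging. Strictly speaking, that call breaks
the very async-signal-safety rule under discussion; it is done here only
to observe the state of the lock.

```c
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Stand-in for malloc()'s internal non-recursive mutex. */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static sem_t locked;   /* posted once the holder thread owns the lock */
static sem_t release;  /* posted when the holder thread may unlock */

/* Plays the role of thread1: takes the lock and keeps it across the fork. */
static void *holder(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock);
    sem_post(&locked);
    while (sem_wait(&release) == -1 && errno == EINTR)
        ;
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Returns 1 if the forked child found the lock held with no thread left
 * to release it, 0 if the lock was free, and -1 on setup failure. */
int child_sees_orphaned_lock(void)
{
    pthread_t t;
    int status = 0;

    sem_init(&locked, 0, 0);
    sem_init(&release, 0, 0);
    if (pthread_create(&t, NULL, holder, NULL) != 0)
        return -1;
    while (sem_wait(&locked) == -1 && errno == EINTR)
        ;

    /* This thread plays the role of thread 2 and forks while the
     * holder thread owns the lock. */
    pid_t pid = fork();
    if (pid == 0) {
        /* Only the forking thread exists in the child. A blocking
         * pthread_mutex_lock() here would wait forever. */
        _exit(pthread_mutex_trylock(&lock) == EBUSY ? 1 : 0);
    }
    if (pid > 0) {
        while (waitpid(pid, &status, 0) == -1 && errno == EINTR)
            ;
    }

    /* Let the holder thread finish in the parent and clean up. */
    sem_post(&release);
    pthread_join(t, NULL);
    sem_destroy(&locked);
    sem_destroy(&release);

    if (pid < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

In the parent the lock is released normally once the holder thread
continues. In the child the same lock stays held for good, because the
thread that took it was not duplicated by fork.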