Mail Archives: cygwin/2014/08/07/17:42:24

Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm
List-Id: <cygwin.cygwin.com>
List-Subscribe: <mailto:cygwin-subscribe AT cygwin DOT com>
List-Archive: <http://sourceware.org/ml/cygwin/>
List-Post: <mailto:cygwin AT cygwin DOT com>
List-Help: <mailto:cygwin-help AT cygwin DOT com>, <http://sourceware.org/ml/#faqs>
Sender: cygwin-owner AT cygwin DOT com
Mail-Followup-To: cygwin AT cygwin DOT com
Delivered-To: mailing list cygwin AT cygwin DOT com
Message-ID: <53E3F2AE.7030608@redhat.com>
Date: Thu, 07 Aug 2014 15:42:06 -0600
From: Eric Blake <eblake AT redhat DOT com>
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.7.0
MIME-Version: 1.0
To: cygwin AT cygwin DOT com
Subject: Re: (call-process ...) hangs in emacs
References: <53DB8D23 DOT 7060806 AT alice DOT it> <CAK9Gx1cjj-7cDP7CunD7Bxz35L+SU9+4Ro3HRot5cwjcArudOA AT mail DOT gmail DOT com> <20140801133225 DOT GD25860 AT calimero DOT vinschen DOT de> <53DEDBBA DOT 20102 AT cornell DOT edu> <20140804080034 DOT GA2578 AT calimero DOT vinschen DOT de> <53DF8BDC DOT 8090104 AT cornell DOT edu> <20140804134526 DOT GK2578 AT calimero DOT vinschen DOT de> <53E0CC2D DOT 4080305 AT cornell DOT edu> <20140805135830 DOT GA9994 AT calimero DOT vinschen DOT de> <53E11A93 DOT 9070800 AT cornell DOT edu> <20140805184047 DOT GC13601 AT calimero DOT vinschen DOT de> <53E3685B DOT 8050508 AT cornell DOT edu> <53E39BAD DOT 3010004 AT redhat DOT com> <53E3CB46 DOT 1020909 AT cornell DOT edu>
In-Reply-To: <53E3CB46.1020909@cornell.edu>
OpenPGP: url=http://people.redhat.com/eblake/eblake.gpg
X-IsSubscribed: yes

On 08/07/2014 12:53 PM, Ken Brown wrote:
> On 8/7/2014 11:30 AM, Eric Blake wrote:
>> On 08/07/2014 05:51 AM, Ken Brown wrote:
>>>
>>> I think I found the problem with NORMAL mutexes.  emacs calls
>>> pthread_atfork after initializing the mutexes, and the resulting
>>> 'prepare' handler locks the mutexes.  (The parent and child handlers
>>> unlock them.)  So when emacs calls fork, the mutexes are locked, and
>>> shortly thereafter the Cygwin DLL calls calloc, leading to a deadlock.
>>> Here's a gdb backtrace showing the sequence of calls:
>>
>> Arguably, that's an upstream bug in emacs.  POSIX has declared
>> pthread_atfork to be fundamentally useless; it is broken by design,
>> because you cannot use it for anything that is not async-signal-safe
>> without risking deadlock.  And (except for sem_post()), NONE of the
>> standardized locking functions are async-signal-safe.
>>
>> http://austingroupbugs.net/view.php?id=858
>>
>> That said, it would still be nice to support this, since even though the
>> theory says it is broken, there are still lots of (broken)
>> programs/libraries still trying to use it.
>
> So what do you think emacs should do instead of using pthread_atfork? Or
> is it better to just remove it?  I don't know how likely it is that this
> would cause a problem.

The POSIX recommendation is that multithreaded apps limit themselves
solely to async-signal-safe functions in the window between fork and
exec (or use posix_spawn instead of fork/exec).  I don't know what
emacs is trying to do in that window, but at this point, it's certainly
worth reporting it upstream.  If you need a pointer to the full list of
async-signal-safe functions:

http://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_04
and search for "The following table defines a set of functions that
shall be async-signal-safe."
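
To make the posix_spawn alternative concrete, here is a minimal sketch.
The helper name spawn_and_wait is mine, not anything from emacs or
Cygwin, and it assumes the program can be found on PATH.  Because no
user code runs between the fork and the exec, there is no window in
which to call something unsafe:

```c
#include <assert.h>
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/* Hypothetical helper: run argv[0] (looked up on PATH) and wait for it.
   Returns the child's exit status, or -1 on failure or abnormal exit. */
int spawn_and_wait(char *const argv[])
{
    pid_t pid;
    int status;

    assert(argv != NULL && argv[0] != NULL);

    /* posix_spawnp() reports failure through its return value. */
    if (posix_spawnp(&pid, argv[0], NULL, NULL, argv, environ) != 0)
        return -1;

    if (waitpid(pid, &status, 0) < 0)
        return -1;

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling it with an argv of { "true", NULL } should return 0 on a
typical system.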

The most common deadlocks when violating async-signal-safety rules look
like this in single-threaded programs:

function calls malloc()
  malloc() grabs a non-recursive mutex
    async signal arrives
      signal handler called
        signal handler calls malloc()
          malloc() can't grab the mutex - deadlock

and this counterpart in multithreaded programs:

thread1 calls malloc()
  malloc() grabs a non-recursive mutex
thread2 gains control and calls fork()
  because of the fork, thread1 does not exist in the child to release the lock
  child process calls malloc()
    malloc() tries to grab the mutex, but it is locked with no thread to release it

Switching malloc() to a recursive lock may or may not "solve" the
single-threaded deadlock (in that malloc can now obtain the mutex), but
it is probably NOT what you want to happen (unless malloc is fully
re-entrant, the inner instance will see incomplete data and either be
totally clobbered itself, or else totally clobber the outer instance
when it returns).  So it's GOOD that malloc does NOT use a recursive
mutex by default.

In the multithreaded case, you are flat out hosed. Switching to a
recursive lock does not change the picture - you are still deadlocked
waiting on thread1 to release the lock, but thread1 doesn't exist.
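
What does work in the multithreaded case is to never touch such a lock
in the child at all.  Here is a sketch of a fork/exec helper whose
child side calls only execv(), write() and _exit(), all of which are on
the async-signal-safe list.  The name fork_exec_wait is hypothetical,
and it assumes argv[0] is a full path, since execv() does no PATH
search:

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork, exec argv[0], and wait for the child.
   Returns the exit status, 127 if exec failed, or -1 on error. */
int fork_exec_wait(char *const argv[])
{
    pid_t pid;
    int status;

    assert(argv != NULL && argv[0] != NULL);

    pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: no malloc(), no stdio, no locks. */
        static const char msg[] = "exec failed\n";

        execv(argv[0], argv);
        /* Only reached if execv() failed. */
        (void)write(STDERR_FILENO, msg, sizeof msg - 1);
        _exit(127);
    }

    if (waitpid(pid, &status, 0) < 0)
        return -1;

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Anything that needs malloc (building argv, for instance) has to happen
in the parent before the fork.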

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


