Mail Archives: cygwin/2014/11/20/05:00:23

Date: Thu, 20 Nov 2014 11:00:01 +0100
From: Corinna Vinschen <corinna-cygwin AT cygwin DOT com>
To: cygwin AT cygwin DOT com
Subject: Re: Instability with signals and threads
Message-ID: <20141120100001.GL3810@calimero.vinschen.de>
Reply-To: cygwin AT cygwin DOT com
Mail-Followup-To: cygwin AT cygwin DOT com
References: <alpine DOT DEB DOT 2 DOT 02 DOT 1411191708220 DOT 32609 AT artax DOT karlin DOT mff DOT cuni DOT cz>
MIME-Version: 1.0
In-Reply-To: <alpine.DEB.2.02.1411191708220.32609@artax.karlin.mff.cuni.cz>
User-Agent: Mutt/1.5.23 (2014-03-12)

Hi Mikulas,

On Nov 19 17:42, Mikulas Patocka wrote:
> Hi
>
> I have a program that sets a repetitive timer with setitimer and spawns
> several threads.
>
> The program is very unstable on cygwin, it locks up in few minutes.
>
> The bug manifests itself in the following way: the signal thread calls
> cygheap->find_tls to find a thread to deliver the signal to. find_tls
> generates an exception when scanning the threadlist, jumps to the __except
> block and calls threadlist[idx]->remove(INFINITE).
>
> The method threadlist[idx]->remove is called with invalid "this" pointer
> (sometimes it is zero, sometimes it points to unmapped memory), generates
> another exception on "initialized = 0" line and becomes stuck on this
> assignment.

Now that you mention it, it makes sense.  The exception gets triggered
by accessing an invalid member of threadlist.  Using the very same
member in another method call looks... borderline, to say the least.

> I found out that when I modify the remove_tls method so that it always
> acquires the lock and removes the thread from the threadlist (change
> "tls_sentry here(wait)" to "tls_sentry here(INFINITE)"), the bug goes away
> and the multithreaded program is stable.

[Noted your augmenting comment preceding the testcase in your other mail]

> Alternatively - the crash can be fixed if we change "_my_tls.remove (0)"
> to "_my_tls.remove (INFINITE)" in thread_wrapper (though, there is another
> _my_tls.remove (0) call in dll_entry in winsup/cygwin/init.cc and it could
> trigger the same crash)

I don't think so.  In dll_entry, the call is made while handling
DLL_THREAD_DETACH for this very thread, so &_my_tls is still a valid
pointer.

> I'd like to ask - what's the reason for not waiting for the lock in
> remove_tls? If the lock is already locked, remove_tls does nothing - but
> the _cygtls structure is freed anyway, so that there is dangling pointer
> on the thread list. Do you think that we can drop this "wait" argument and
> always wait for the lock in remove_tls?

Unfortunately I can't tell you about the reasoning behind this.  The guy
who created this code has left the project, so we just have to work with
the status quo.
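
To make the failure mode from the quoted paragraph concrete, here is a
minimal sketch in portable C++.  It is not the Cygwin code: FakeTls,
SimpleLock and ThreadList are hypothetical stand-ins for _cygtls, the lock
taken by tls_sentry, and the thread list, and the bool argument only
models the difference between remove (0) and remove (INFINITE).

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <vector>

// Hypothetical stand-in for _cygtls; not the real Cygwin structure.
struct FakeTls { int id; };

// Hypothetical minimal lock standing in for the one tls_sentry manages.
class SimpleLock {
    std::atomic<bool> held{false};
public:
    bool try_acquire() { return !held.exchange(true, std::memory_order_acquire); }
    void acquire() { while (!try_acquire()) { /* spin until the lock is free */ } }
    void release() { held.store(false, std::memory_order_release); }
};

struct ThreadList {
    SimpleLock lock;
    std::vector<FakeTls*> entries;

    // wait == false models remove (0): give up if the lock is busy.
    // wait == true models remove (INFINITE): block until the lock is free.
    // Returns true only if the list was actually updated under the lock.
    bool remove(FakeTls* t, bool wait) {
        if (wait)
            lock.acquire();
        else if (!lock.try_acquire())
            return false;  // lock busy: the entry stays in the list
        entries.erase(std::remove(entries.begin(), entries.end(), t),
                      entries.end());
        lock.release();
        return true;
    }
};

// Returns true if the non-blocking removal left a stale entry behind.
bool nonblocking_remove_leaves_stale_entry() {
    ThreadList list;
    FakeTls* tls = new FakeTls{1};
    list.entries.push_back(tls);

    list.lock.acquire();                     // someone else is scanning the list
    bool removed = list.remove(tls, false);  // exiting thread finds the lock busy
    list.lock.release();
    delete tls;                              // the structure goes away regardless

    // The list still holds the pointer, which now dangles.
    // It is only counted here, never dereferenced.
    return !removed && list.entries.size() == 1;
}

// Returns true if the blocking removal unlinked the entry before it was freed.
bool blocking_remove_unlinks_entry() {
    ThreadList list;
    FakeTls* tls = new FakeTls{2};
    list.entries.push_back(tls);
    bool removed = list.remove(tls, true);
    delete tls;
    return removed && list.entries.empty();
}
```

Both helper functions return true when the model behaves as described:
the first leaves one stale pointer in the list, the second leaves the list
empty.  Whether always waiting is safe in the real code is exactly the
open question, since it depends on who else can hold the lock at that
point.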

> Another possible bug - when find_tls exits, it drops the tls_sentry lock
> and returns the pointer to _cygtls. What happens if the thread owning the
> tls exits at this point? It seems that there is nothing that prevents it
> from exiting and that the caller of find_tls (sigpacket::process) will
> work with a pointer to invalid thread. It seems that we need to add some
> reference count to _cygtls to prevent it from disappearing while we are
> trying to send a signal to it. (or keep tls_sentry::lock locked until
> sigpacket::process is done with the signal, though I don't know if keeping
> the lock for so long won't cause deadlocks).
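
As a rough illustration of the reference-count idea, the sketch below
uses std::shared_ptr as the counter.  Everything in it is an assumption
made for the example: FakeTls and ThreadList are hypothetical types,
find() and remove() only mimic the roles of find_tls and remove_tls, and
it says nothing about how _cygtls storage is actually managed in Cygwin.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <vector>

// Hypothetical stand-in for _cygtls; not the real Cygwin structure.
struct FakeTls {
    int thread_id;
    int pending_signal;
};

class ThreadList {
    std::mutex lock;
    std::vector<std::shared_ptr<FakeTls>> entries;
public:
    void add(int thread_id) {
        auto t = std::make_shared<FakeTls>();
        t->thread_id = thread_id;
        t->pending_signal = 0;
        std::lock_guard<std::mutex> guard(lock);
        entries.push_back(t);
    }

    // The reference is taken while the lock is held, so the caller keeps
    // the object alive even after the lock has been dropped.
    std::shared_ptr<FakeTls> find(int thread_id) {
        std::lock_guard<std::mutex> guard(lock);
        for (const auto& e : entries)
            if (e->thread_id == thread_id)
                return e;
        return nullptr;
    }

    // Called when the owning thread exits: drops only the list's reference.
    void remove(int thread_id) {
        std::lock_guard<std::mutex> guard(lock);
        for (std::size_t i = 0; i < entries.size(); ++i) {
            if (entries[i]->thread_id == thread_id) {
                entries.erase(entries.begin() + i);
                return;
            }
        }
    }
};

// Returns true if a reference obtained before the thread exited is still
// usable afterwards, and the thread is no longer findable in the list.
bool reference_outlives_thread_exit() {
    ThreadList list;
    list.add(7);

    std::shared_ptr<FakeTls> target = list.find(7);  // signal side picks a thread
    list.remove(7);                                  // that thread exits meanwhile

    bool unlisted = (list.find(7) == nullptr);
    target->pending_signal = 14;  // still valid: the last reference is ours

    return unlisted && target.use_count() == 1 && target->pending_signal == 14;
}
```

The point of the design is that the lock only needs to be held for the
lookup, not for the whole of signal processing, which sidesteps the
deadlock concern about holding it for too long.  The cost is that the
signal side may end up working on a thread that has already exited and
has to cope with that.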

I'm going to investigate this in the next couple of days.  However,
since you're obviously willing and able to debug this situation
yourself, I would very much appreciate your further input.  If you would
enjoy providing patches to Cygwin, please feel free to read
https://cygwin.com/contrib.html and follow up on the cygwin-developer
and cygwin-patches mailing lists.  The copyright assignment is still
required for non-trivial patches (~10 lines rule).  I hope that's ok.


Thanks a lot,
Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat

