Date: Thu, 20 Nov 2014 17:22:10 +0100
From: Corinna Vinschen <corinna-cygwin AT cygwin DOT com>
To: cygwin AT cygwin DOT com
Subject: Re: Instability with signals and threads

On Nov 20 11:00, Corinna Vinschen wrote:
> Hi Mikulas,
>
> On Nov 19 17:42, Mikulas Patocka wrote:
> > Hi
> >
> > I have a program that sets a repetitive timer with setitimer and spawns
> > several threads.
> >
> > The program is very unstable on cygwin, it locks up in few minutes.
> >
> > The bug manifests itself in the following way: the signal thread calls
> > cygheap->find_tls to find a thread to deliver the signal to. find_tls
> > generates an exception when scanning the threadlist, jumps to the __except
> > block and calls threadlist[idx]->remove(INFINITE).
> >
> > The method threadlist[idx]->remove is called with invalid "this" pointer
> > (sometimes it is zero, sometimes it points to unmapped memory), generates
> > another exception on "initialized = 0" line and becomes stuck on this
> > assignment.
>
> Now that you mention it, it makes sense.  The exception gets triggered
> by accessing an invalid member of threadlist.  Using the very same
> member in another method call looks... borderline, to say the least.
>
> > I found out that when I modify the remove_tls method so that it always
> > acquires the lock and removes the thread from the threadlist (change
> > "tls_sentry here(wait)" to "tls_sentry here(INFINITE)"), the bug goes away
> > and the multithreaded program is stable.
>
> [Noted your augmenting comment preceding the testcase in your other mail]
>
> > Alternatively - the crash can be fixed if we change "_my_tls.remove (0)"
> > to "_my_tls.remove (INFINITE)" in thread_wrapper (though, there is another
> > _my_tls.remove (0) call in dll_entry in winsup/cygwin/init.cc and it could
> > trigger the same crash)
>
> I don't think so.  In dll_entry, the call is done inside a DLL_THREAD_DETACH
> for this very thread, so &_my_tls is still a valid pointer.

Never mind that.  I can fix your testcase by calling _my_tls.remove with
INFINITE as parameter in both places.  If I drop one of them, your
testcase will invariably fail at some point.  With both INFINITE params
in place, your testcase has now been running for half an hour without
problems.

Thinking about it, the fact that _cygtls::remove accepts a
non-INFINITE wait is rather strange, isn't it?  When remove_tls is
called with a 0 wait, the function can return silently without
actually having removed the thread from the list.  This is bound to
go downhill at some point and looks like a kludge to me to circumvent
some potential hang in another situation...
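
To make that failure mode concrete, here is a minimal single-threaded
sketch.  It is not the Cygwin code: FakeTls, ThreadList, WAIT_FOREVER,
hold and unhold are hypothetical names made up for illustration, and
the lock is modelled with a plain std::atomic_flag.  It only shows how
a removal with a 0 wait can return without removing anything when the
lock happens to be busy.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <vector>

// Hypothetical stand-in for a per-thread TLS area (not Cygwin's _cygtls).
struct FakeTls { int id; };

// Hypothetical stand-in for an "infinite" wait value.
constexpr unsigned WAIT_FOREVER = 0xFFFFFFFFu;

// Hypothetical model of a thread list guarded by a try-lock.
class ThreadList {
    std::atomic_flag busy_ = ATOMIC_FLAG_INIT;
    std::vector<FakeTls *> entries_;

    // One attempt for any finite wait; spin until success for WAIT_FOREVER.
    bool acquire(unsigned wait) {
        do {
            if (!busy_.test_and_set(std::memory_order_acquire))
                return true;
        } while (wait == WAIT_FOREVER);
        return false;
    }
    void release() { busy_.clear(std::memory_order_release); }

public:
    // Demonstration helpers: pretend some other thread owns the lock.
    bool hold() { return acquire(0); }
    void unhold() { release(); }

    void add(FakeTls *t) {
        assert(t != nullptr);
        acquire(WAIT_FOREVER);
        entries_.push_back(t);
        release();
    }

    // Returns false, leaving t in the list, if the lock was not obtained.
    bool remove(FakeTls *t, unsigned wait) {
        if (!acquire(wait))
            return false;  // silent early return: t is still listed
        entries_.erase(std::remove(entries_.begin(), entries_.end(), t),
                       entries_.end());
        release();
        return true;
    }

    // Unsynchronized lookup, for inspection in this demo only.
    bool contains(const FakeTls *t) const {
        return std::find(entries_.begin(), entries_.end(), t) != entries_.end();
    }
};
```

In this model, calling remove with a 0 wait while the lock is held
leaves the entry behind.  If the owning thread then exits, the list
keeps a pointer to memory that no longer exists, which would match the
invalid "this" pointer described above.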

I'm not exactly sure if that works as intended.  I will apply this patch
and create a new Cygwin snapshot on https://cygwin.com/snapshots/ in a
couple of minutes.  I'd appreciate it if you and others would give it an
exhaustive test.  New spurious hangs or SEGVs in other situations which
so far worked fine would be good indicators for another problem in the
code.

Other than that, there's certainly some room for improvement.  Calling
threadlist[idx]->remove from the find_tls exception handler looks
extremely hairy to me.  I wonder if that should be called at all at this
point, or whether there should rather be some "simplified" removal
operation which doesn't require the _cygtls pointer.  If the thread
doesn't exist anymore, neither does its _cygtls area.
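
As a rough idea of what such a "simplified" removal could look like,
here is a sketch that drops a list slot by index and never
dereferences the stored pointer.  FakeTls, ThreadTable, kMax and
remove_at are hypothetical names and this is not a proposed patch;
locking is left out entirely and would still be needed in real code.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a per-thread TLS area (not Cygwin's _cygtls).
struct FakeTls { int id; };

// Hypothetical fixed-size table of TLS pointers.
struct ThreadTable {
    static constexpr std::size_t kMax = 8;
    FakeTls *slots[kMax] = {};
    std::size_t count = 0;

    bool add(FakeTls *t) {
        if (count == kMax)
            return false;
        slots[count++] = t;
        return true;
    }

    // Drops slot idx by moving the last live slot into its place.
    // Only pointer values are copied; slots[idx] itself is never
    // dereferenced, so a stale or null entry cannot fault here.
    void remove_at(std::size_t idx) {
        assert(idx < count);
        --count;
        if (idx < count)
            slots[idx] = slots[count];
        slots[count] = nullptr;
    }
};
```

The point of the design is that the handler only needs the index it
was scanning when the exception hit, not a valid object behind it.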


Thanks,
Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat
