Mailing-List: | contact cygwin-help AT cygwin DOT com; run by ezmlm |
List-Id: | <cygwin.cygwin.com> |
List-Subscribe: | <mailto:cygwin-subscribe AT cygwin DOT com> |
List-Archive: | <http://sourceware.org/ml/cygwin/> |
List-Post: | <mailto:cygwin AT cygwin DOT com> |
List-Help: | <mailto:cygwin-help AT cygwin DOT com>, <http://sourceware.org/ml/#faqs> |
Sender: | cygwin-owner AT cygwin DOT com |
Mail-Followup-To: | cygwin AT cygwin DOT com |
Delivered-To: | mailing list cygwin AT cygwin DOT com |
Date: | Thu, 20 Nov 2014 21:22:31 +0100 (CET) |
From: | Mikulas Patocka <mikulas AT artax DOT karlin DOT mff DOT cuni DOT cz> |
To: | Corinna Vinschen <corinna-cygwin AT cygwin DOT com> |
cc: | cygwin AT cygwin DOT com |
Subject: | Re: Instability with signals and threads |
Message-ID: | <alpine.DEB.2.02.1411202055420.8559@artax.karlin.mff.cuni.cz> |
User-Agent: | Alpine 2.02 (DEB 1266 2009-07-14) |
X-Personality-Disorder: | Schizoid |
MIME-Version: | 1.0 |
> Never mind that.  I can fix your testcase by calling _my_tls.remove with
> INFINITE as parameter in both places.  If I drop one of them, your
> testcase will invariably fail at one point.  With both INFINITE params
> in place, your testcase is now running half an hour without problems.

For me, this change doesn't fix the testcase; it just reduces the
probability that it hangs. With this change, the testcase still locks up,
but with a different stacktrace:

thread1:
    Sleep
    _yield
    pthread::create
    sigdelayed
    ??
    _cygwin_exit_return
    ??
    _cygtls::call2
thread2:
    SetEvent
    muto::release
    init_cygheap::find_tls
    _cygtls::init_thread
thread3:
    WriteFile
    sig_send
    timer_thread
    cygthread::callfunc
    cygthread::stub
    _cygtls::call2
thread4:
    VirtualFree
    thread_wrapper
thread5:
    only ntdll stuff

So, apparently, there is another bug, where thread->cygtls isn't being set
and pthread::create loops endlessly calling yield.

> Thinking about it, the fact that _cygtls::remove allows to apply a
> non-INFINITE wait is rather strange, isn't it?  Calling remove_tls with
> a 0 wait, it allows to return the function silently, without actually
> having removed the thread from the list.  This is bound to go downhill
> at one point and looks like a kludge to me to circumvent some potential
> hang in another situation...

Looking at the CVS history, the "wait" argument was added in cygtls.cc
version 1.2 with the comment: "Add a 'wait' argument to control how long
we wait for a lock before removing." There is no explanation of why it is
needed.

> I'm not exactly sure if that works as intended.  I will apply this patch
> and create a new Cygwin snapshot on https://cygwin.com/snapshots/ in a
> couple of minutes.  I'd appreciate if you and others would give it an
> exhaustive test.  New spurious hangs or SEGVs in other situations which
> so far worked fine would be good indicators for another problem in the
> code.

Yes, I think it's correct to remove the wait argument.
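To make the hazard with a bounded wait concrete, here is a minimal sketch of a list whose removal routine gives up when it cannot take the lock in time. It is not the Cygwin code: ThreadList, remove_timed, remove_blocking and the atomic flag used as a lock are all hypothetical stand-ins, and integer ids take the place of _cygtls pointers.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>
#include <vector>

// Hypothetical model of a lock-protected thread list (not Cygwin's types).
struct ThreadList {
    std::atomic<bool> locked{false};  // true while some thread holds the lock
    std::vector<int> entries;         // ids stand in for _cygtls pointers

    // Try to take the lock until the deadline passes; false on timeout.
    bool acquire_for(std::chrono::milliseconds wait) {
        const auto deadline = std::chrono::steady_clock::now() + wait;
        for (;;) {
            bool expected = false;
            if (locked.compare_exchange_strong(expected, true))
                return true;
            if (std::chrono::steady_clock::now() >= deadline)
                return false;
            std::this_thread::yield();
        }
    }

    void release() { locked.store(false); }

    // Bounded wait: on timeout the entry silently stays in the list.
    bool remove_timed(int id, std::chrono::milliseconds wait) {
        if (!acquire_for(wait))
            return false;
        erase(id);
        release();
        return true;
    }

    // Unbounded wait: spins until the lock is held, so the entry is
    // always gone when this returns.
    void remove_blocking(int id) {
        bool expected = false;
        while (!locked.compare_exchange_strong(expected, true)) {
            expected = false;
            std::this_thread::yield();
        }
        erase(id);
        release();
    }

private:
    void erase(int id) {
        entries.erase(std::remove(entries.begin(), entries.end(), id),
                      entries.end());
    }
};

// Simulates a zero-wait removal while the lock is held elsewhere and
// reports whether the entry was left behind.
bool zero_wait_leaves_entry_listed() {
    ThreadList list;
    list.entries = {1, 2};
    list.locked.store(true);  // pretend another thread owns the lock
    const bool removed = list.remove_timed(1, std::chrono::milliseconds(0));
    return !removed && list.entries.size() == 2;
}
```

In this model a caller that ignores the return value of remove_timed keeps a list entry for a thread that is already gone, which is the situation the quoted paragraph warns about; remove_blocking cannot leave the entry behind, at the cost of possibly waiting forever.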
> Other than that, there's certainly some room for improvement.  Calling
> threadlist[idx]->remove from the find_tls exception handler looks
> extremely hairy to me.  I wonder if that should be called at all at this
> point, or if there shouldn't better be some "simplified" removal
> operation which doesn't require the _cygtls pointer.  If the thread
> doesn't exist anymore, neither does its _cygtls area.

I suggest removing that exception handler altogether. It can never work
reliably: it could reduce the probability of crashes, but not eliminate
them. Even if we handled the page fault correctly, what happens if some
other thread allocates a different object at the location that previously
belonged to the tls? Then find_tls thinks that this different object is a
tls and corrupts it.

I suggest removing the exception handler and, if that results in any
regressions, fixing them properly.

Mikulas

> Thanks,
> Corinna
>
> --
> Corinna Vinschen

--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
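The address-reuse objection raised in the reply can be modelled in a few lines. The sketch below is purely illustrative and assumes nothing about Cygwin internals: Pool, Slot, RawRef and TaggedRef are hypothetical names, a slot index stands in for a raw address, and the generation counter is an extra device added here for contrast, not something proposed in the message.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical fixed pool; a slot index plays the role of an address.
struct Slot {
    bool in_use = false;
    std::uint32_t generation = 0;  // bumped each time the slot is freed
    int payload = 0;
};

struct Pool {
    std::array<Slot, 4> slots{};

    // Hands out the first free slot, so a freed slot is reused at once.
    std::size_t allocate(int payload) {
        for (std::size_t i = 0; i < slots.size(); ++i) {
            if (!slots[i].in_use) {
                slots[i].in_use = true;
                slots[i].payload = payload;
                return i;
            }
        }
        return slots.size();  // pool exhausted
    }

    void release(std::size_t i) {
        slots[i].in_use = false;
        ++slots[i].generation;
    }
};

// Remembers only where the object was.
struct RawRef { std::size_t index; };

// Also remembers which incarnation of the slot it referred to.
struct TaggedRef { std::size_t index; std::uint32_t generation; };

// "Is something mapped there?" check: passes for a reused slot.
bool looks_valid(const Pool& p, RawRef r) {
    return p.slots[r.index].in_use;
}

// Passes only if the slot still holds the original incarnation.
bool still_same_object(const Pool& p, TaggedRef r) {
    const Slot& s = p.slots[r.index];
    return s.in_use && s.generation == r.generation;
}

// Frees an object, lets an unrelated one take its place, and reports
// whether the location-only check is fooled while the tagged one is not.
bool reuse_fools_location_check() {
    Pool pool;
    const std::size_t tls = pool.allocate(111);
    const RawRef raw{tls};
    const TaggedRef tagged{tls, pool.slots[tls].generation};
    pool.release(tls);
    const std::size_t other = pool.allocate(222);  // lands in the same slot
    return other == tls && looks_valid(pool, raw) &&
           !still_same_object(pool, tagged);
}
```

Any check that only asks whether the location is accessible behaves like looks_valid: it accepts the unrelated object, and a write through the stale reference would then corrupt it. Catching a page fault covers only the case where the memory was never reused.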