Date: Thu, 16 Aug 2012 14:32:16 -0400
From: Christopher Faylor
To: cygwin AT cygwin DOT com
Subject: Re: Cygwin crashes in kill_pgrp, _pinfo truncation issue.
Message-ID: <20120816183216.GA16862@ednor.casa.cgf.cx>

On Thu, Aug 16, 2012 at 08:20:37PM +0400, Andrey Khalyavin wrote:
>On Wed, 15 Aug 2012 10:11:16 -0400, Christopher Faylor wrote:
>>On Wed, Aug 15, 2012 at 04:54:42PM +0400, Andrey Khalyavin wrote:
>>>I finally got a cygwin crash dump from our build bots. It shows, that
>>>cygwin1.dll crashes in kill_pgrp function on line:
>>> (pid > 1 && p->pgid != pid) ||
>>>where p is a pointer to _pinfo. This function enumerates all _pinfo's
>>>and executes this line for all of them which pass p->exists() check.
>>>In crash dump p points to _pinfo that has process_state equal to
>>>PID_IN_USE | PID_EXECED.
>>
>>Thanks for tracking this down. I've added a check for "execed" to
>>_pinfo::exists.
>>
>>cgf
>I updated core libraries from 20120803 snapshot to 20120815 snapshot
>and now bash crashes when I execute rm -rf dir. Reproducibility is
>strange.
>It crashed for hours when I entered
>cd /tmp
>mkdir a
>rm -rf a
>commands but now suddenly stopped crashing in this case.
>It still crashes on rm -rf in the real script we use though.
>
>Crash happens in setup_handler function on line
> HANDLE hth = (HANDLE) *tls;
>because tls->tid equals zero. Definition of this operation is in
>cygtls.h: operator HANDLE () const {return tid->win32_obj_id;}.
>setup_handler is called from sigpacket::process which in turn is
>called from wait_sig. Signal number is 20, signal code is 28.
>All fields of tls structure are zero with the exception of stacklock,
>equal to 1, and stackptr, equal to the address of tls->stack.

Sounds like a race between thread creation and signal handling. I have
added some defensive code in the latest snapshot.

cgf
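
For illustration only, here is a minimal sketch of the kind of guard that would avoid the null dereference described in the report. It is not the code from the snapshot, and it does not use the real Cygwin types: Handle, ThreadObj, Tls, handle_or_null, Delivery and try_deliver are all hypothetical stand-ins. It assumes that a zero tid means the target thread has not finished initializing, so delivery is deferred rather than attempted.

```cpp
#include <cassert>

// Stand-in for a Windows HANDLE (assumption; the real type comes from <windows.h>).
using Handle = void *;

// Hypothetical miniature of the thread object that tid points at.
struct ThreadObj {
  Handle win32_obj_id;
};

// Hypothetical miniature of the per-thread area described in the report.
struct Tls {
  ThreadObj *tid = nullptr;  // still null while the thread is being set up
  int stacklock = 0;

  // The conversion quoted in the report dereferences tid unconditionally:
  //   operator HANDLE () const {return tid->win32_obj_id;}
  // This guarded variant returns a null handle when tid is not set yet.
  Handle handle_or_null () const {
    return tid ? tid->win32_obj_id : nullptr;
  }
};

// Outcome of one delivery attempt in this sketch.
enum class Delivery { Delivered, Deferred };

// Sketch of a delivery step that backs off when the target thread's
// tid has not been filled in, instead of crashing on the dereference.
Delivery try_deliver (const Tls &tls) {
  Handle hth = tls.handle_or_null ();
  if (!hth)
    return Delivery::Deferred;  // thread not fully created; caller retries later
  // A real implementation would operate on the thread through hth here.
  return Delivery::Delivered;
}
```

With a Tls whose fields are all zero except stacklock (the state seen in the crash dump), try_deliver returns Delivery::Deferred. Once tid points at a thread object with a non-null handle, it returns Delivery::Delivered.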