Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm
Sender: cygwin-owner AT cygwin DOT com
Mail-Followup-To: cygwin AT cygwin DOT com
Delivered-To: mailing list cygwin AT cygwin DOT com
Date: Fri, 23 Sep 2005 20:11:10 -0400
From: Christopher Faylor
To: cygwin AT cygwin DOT com
Subject: Re: Funny hang with snapshop 20050920
Message-ID: <20050924001110.GA1390@trixie.casa.cgf.cx>
Reply-To: cygwin AT cygwin DOT com
References: <4333660B DOT 7060305 AT scytek DOT de> <20050923022619 DOT GB21253 AT trixie DOT casa DOT cgf DOT cx> <43348E75 DOT 7080309 AT scytek DOT de>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <43348E75.7080309@scytek.de>
User-Agent: Mutt/1.5.8i

On Fri, Sep 23, 2005 at 07:23:33PM -0400, Volker Quetschke wrote:
>Christopher Faylor wrote:
>>On Thu, Sep 22, 2005 at 10:18:51PM -0400, Volker Quetschke wrote:
>>
>>>My favorite testcase (building OOo) started hanging again.
>>>...
>>>But now the *really* strange part begins: You can break the hang by doing
>>>"ls /proc/3176/fd" !?
>>>and the build continues (until the next hang).
>>>
>>>Sorry, we're unable to create a reduced testcase but we thought the
>>>strange symptoms might help pinpoint the problem.
>>>
>>>Attached you also find the cygcheck output of that system.
>>
>>Does sending a 'kill -CONT 3176' also unstick things? Both situations
>>send a signal to the process.
>
>Sorry, this question got lost, but ...
>
>>How about attaching to the hung process with strace? You didn't mention
>>that.
>
>he tried to attach and strace was standing there without output.
>A "ls /proc//fd" produced then the first four lines of the
>attached strace log but tcsh still hung.

You know, I noticed yesterday that there was some information missing
from the strace output in the open_shared function and, of course, I
didn't fix it. Oh well. That means that I don't get much from this
strace output.

>Several "ls /proc//fd" later it continued and produced the
>rest of that logfile.
>
>Did you notice that the WINPID of the hanging tcsh is the same as
>the PID? This is always the case if it hangs.

That just means that the process has forked but hasn't execed anything.
I don't think that's significant.

>Additional info: Both tcsh processes exist with the respective
>WINPID in taskmgr.

I'd expect that they did or you wouldn't be able to attach to them.

There is a new snapshot up there now. I think I've given up on the
technique that I was trying to use to fix the Windows 98 bug. I've
yanked out a lot of the code and simplified things but I hope I haven't
caused the bug to reemerge.

Could you try the 2005-09-23 snapshot? Same rules. I'd still like to
know if sending a CONT to the hung process fixes it as well as
ls /proc/nnn/fd and I'd still like to see the strace output if the
process hangs again.

Also could anyone who could duplicate the Windows 98 error popup dialog
confirm or deny if it is still fixed?

cgf

--
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
Problem reports:       http://cygwin.com/problems.html
Documentation:         http://cygwin.com/docs.html
FAQ:                   http://cygwin.com/faq/