Mail Archives: cygwin/2005/09/22/22:28:55

Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm
List-Subscribe: <mailto:cygwin-subscribe AT cygwin DOT com>
List-Archive: <http://sourceware.org/ml/cygwin/>
List-Post: <mailto:cygwin AT cygwin DOT com>
List-Help: <mailto:cygwin-help AT cygwin DOT com>, <http://sourceware.org/ml/#faqs>
Sender: cygwin-owner AT cygwin DOT com
Mail-Followup-To: cygwin AT cygwin DOT com
Delivered-To: mailing list cygwin AT cygwin DOT com
Date: Thu, 22 Sep 2005 22:26:19 -0400
From: Christopher Faylor <cgf-no-personal-reply-please AT cygwin DOT com>
To: cygwin AT cygwin DOT com
Subject: Re: Funny hang with snapshop 20050920
Message-ID: <20050923022619.GB21253@trixie.casa.cgf.cx>
Reply-To: cygwin AT cygwin DOT com
References: <4333660B DOT 7060305 AT scytek DOT de>
Mime-Version: 1.0
In-Reply-To: <4333660B.7060305@scytek.de>
User-Agent: Mutt/1.5.8i

On Thu, Sep 22, 2005 at 10:18:51PM -0400, Volker Quetschke wrote:
>My favorite testcase (building OOo) started hanging again.
>
>(Un?-)fortunately not on one of my systems and we also didn't manage
>to reproduce with a reduced testcase. But the problem generally is:
>In a tcsh shell (2980) start a perl script (3012) that starts a cygwin
>program (a make clone) (3016) that starts a command in a tcsh.
>
>Now the fun part begins, the started tcsh command is (should be) just
> "/usr/bin/tcsh -fc pwd" (3736)
>but this process starts another process
> "/usr/bin/tcsh -fc pwd" (3176)
>that has exactly the same command and appears to hang. Below you see
>the output of a ps command:
>
>      PID    PPID    PGID     WINPID  TTY  UID    STIME COMMAND
>     3772       1    3772       3772  con 11290 18:59:34 /usr/bin/bash
>     2980    3772    2980       3124  con 11290 18:59:39 /usr/bin/tcsh
>     3616       1    3616       3616  con 11290 19:10:02 /usr/bin/bash
>     3452    3616    3452        444  con 11290 19:10:07 /usr/bin/tcsh
>     3012    2980    3012       3912  con 11290 18:10:56 /usr/bin/perl
>     3016    3012    3012       3916  con 11290 18:37:01 /cygdrive/e/work/OOo/SRC680_m124/solenv/wntmsci10/bin/dmake
>     3736    3016    3012       3392  con 11290 18:37:01 /usr/bin/tcsh
>     3176    3736    3012       3176  con 11290 18:37:01 /usr/bin/tcsh
>     3804    3452    3804       3196  con 11290 18:40:17 /usr/bin/ps
>
>Attached you find the output of a "cat /proc/<pid>/*" for the two pids.
>
>But now the *really* strange part begins: You can break the hang by doing
>  "ls /proc/3176/fd" !?
>and the build continues (until the next hang).
>
>Sorry, we're unable to create a reduced testcase but we thought the
>strange symptoms might help pinpoint the problem.
>
>Attached you also find the cygcheck output of that system.
>
>I hope this helps a little bit,

Does sending a 'kill -CONT 3176' also unstick things?  Both that and the
'ls /proc/3176/fd' send a signal to the process.

How about attaching to the hung process with strace?  You didn't mention
that.
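
A minimal sketch of the first check, assuming a POSIX-style shell; the
helper name "nudge" is hypothetical and only there for illustration.  It
sends SIGCONT to a pid and reports whether the signal could be delivered:

```shell
# Hypothetical helper: send SIGCONT to the given pid and report the result.
# SIGCONT is harmless to a process that is not stopped, so this only
# "pokes" the target; it does not terminate it.
nudge() {
    pid=$1
    if kill -CONT "$pid" 2>/dev/null; then
        echo "sent SIGCONT to $pid"
    else
        echo "no such process: $pid"
        return 1
    fi
}
```

For the hung tcsh in the listing above that would be "nudge 3176".  For
the second check, Cygwin's strace can attach to a running process by its
Cygwin pid, e.g. "strace -o tcsh.strace -p 3176" (the output file name is
just an example).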

cgf

(who deeply regrets ever trying to fix the Windows 98 crash problem)

--
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
Problem reports:       http://cygwin.com/problems.html
Documentation:         http://cygwin.com/docs.html
FAQ:                   http://cygwin.com/faq/
