Message-ID: <425DE7F1.9FA775A9@dessent.net>
Date: Wed, 13 Apr 2005 20:48:01 -0700
From: Brian Dessent
To: cygwin AT cygwin DOT com
Subject: Re: Losing track of processes?

"Shaffer, Kenneth" wrote:

> I have a suite of scripts which process logs but get hung after two
> hours. My initial looking into it shows that cygwin ps command thinks the
> processes are present, but windows task manager doesn't see them at all.
> It's as if the parent wasn't informed that it's child died. Perhaps a wait
> system call isn't working or memory corruption of data structures
> containing this information or race conditions on accessing data
> structures, etc.
>
> I have run into this off and on since cygwin1.dll 1.5.12 always hoping
> that new versions would make the problem go away. I seem to recall similar
> posts by others. I'm now running the 1.5.15 4/12 snapshot.
>
> Anyway hoping there might be suggestions to help track this down. I'll try
> strace one more time (when run with it before, the problem did not occur).

If you're not using the -17 (test) version of bash, try that. Bash has a problem where if a PID is reused, it will get confused and continue to return the exit value of the original process with that PID and not retrieve the exit status of the second process that was spawned with a duplicated PID. Or something like that. (The details are in the archive.) Anyway, the -17 version includes a fix, but is marked 'test' so you won't get it unless you explicitly select that version.
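To make the PID-reuse failure mode concrete, here is a toy model of it in Python. This is not bash's code, and how the -17 fix actually works is not described here. The class `PidKeyedStatusCache`, its `forget_on_spawn` flag, and the PID 1234 are all hypothetical names made up for the illustration. The sketch only assumes what is described above: exit statuses are remembered per PID, and an old entry can shadow a newer process that received the same PID.

```python
class PidKeyedStatusCache:
    """Toy job table that remembers exit statuses keyed only by PID."""

    def __init__(self, forget_on_spawn):
        # Hypothetical switch: True models one plausible remedy,
        # False models the confused behaviour.
        self.forget_on_spawn = forget_on_spawn
        self._statuses = {}

    def spawn(self, pid):
        # The remedied variant drops any stale entry when a PID is handed out again.
        if self.forget_on_spawn:
            self._statuses.pop(pid, None)

    def child_exited(self, pid, status):
        # The first status recorded for a PID wins; later ones are ignored.
        self._statuses.setdefault(pid, status)

    def exit_status(self, pid):
        return self._statuses[pid]


def run_two_jobs(table, pid=1234):
    """Run two simulated processes that happen to receive the same PID."""
    table.spawn(pid)
    table.child_exited(pid, 0)  # first process succeeds
    table.spawn(pid)            # the OS reuses the PID
    table.child_exited(pid, 1)  # second process fails
    return table.exit_status(pid)


stale = run_two_jobs(PidKeyedStatusCache(forget_on_spawn=False))
fresh = run_two_jobs(PidKeyedStatusCache(forget_on_spawn=True))
print("PID-keyed table reports:", stale)
print("table that forgets reused PIDs reports:", fresh)
```

In the first run the table still reports the first process's success even though the second process failed, which is the kind of confusion that could leave a script waiting on, or misjudging, the wrong job.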
As to why you are seeing processes in ps that don't exist in Task Manager, I have no idea. It could be that Cygwin is still retaining information about processes that have terminated because nothing has yet called wait() on them to retrieve their exit status. I don't think the notion of zombie processes exists in Windows, so Cygwin has to emulate it. But that's just wild speculation.

What I do know is that if there is a real bug hiding in here somewhere, it will never get fixed until someone can narrow it down to something that is reproducible. If you can manage to whittle down your script into a generic testcase that exhibits the problem, then at least someone could look into it. But until then, or until someone who can reproduce the problem and is familiar enough with Cygwin can debug what's going on, I don't think anything is going to happen.

Brian
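For reference, the wait() behaviour speculated about above can be observed on an ordinary POSIX system with a short Python sketch. It says nothing about how Cygwin emulates this on Windows. It assumes a POSIX-like environment where `os.fork` is available; the function name `demo_unreaped_child` and the exit code 7 are arbitrary choices for the illustration.

```python
import os


def demo_unreaped_child(exit_code=7):
    """Fork a child that exits at once; check its entry before and after waitpid()."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: exit immediately, skipping Python-level cleanup handlers.
        os._exit(exit_code)

    # Parent: once our own write end is closed, read() returns EOF only
    # after the child has closed its copy, which happens when it exits.
    os.close(write_fd)
    os.read(read_fd, 1)
    os.close(read_fd)

    def has_entry():
        # Signal 0 delivers nothing; it only checks that the PID is still known.
        try:
            os.kill(pid, 0)
            return True
        except ProcessLookupError:
            return False

    before = has_entry()            # child is gone but has not been waited for
    _, status = os.waitpid(pid, 0)  # collect the exit status (reap the child)
    after = has_entry()             # the entry should have disappeared
    return before, os.WEXITSTATUS(status), after


if __name__ == "__main__":
    print(demo_unreaped_child())
```

The first value in the returned tuple shows that the PID is still known after the child has exited, the second is the exit code collected by `waitpid`, and the third shows that the entry is gone once the status has been retrieved.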