Mail Archives: cygwin/2019/04/04/08:44:31

From: "E. Madison Bray" <erik DOT m DOT bray AT gmail DOT com>
Date: Thu, 4 Apr 2019 14:43:59 +0200
Message-ID: <CAOTD34YYHE6qHhHFwwXq1VAJ6ME4oyZMsa=BbQx8txsH4p3puA@mail.gmail.com>
Subject: Re: Problem with zombie processes
To: cygwin AT cygwin DOT com

On Tue, Feb 21, 2017 at 12:58 PM Erik Bray wrote:
>
> On Mon, Feb 20, 2017 at 11:54 PM, Mark Geisert wrote:
> > Erik Bray wrote:
> >>
> >> On Mon, Feb 20, 2017 at 11:54 AM, Mark Geisert wrote:
> >>>>
> >>>> So my guess was that Cygwin might try to hold on to a handle to a
> >>>> child process at least until it's been explicitly wait()ed.  But that
> >>>> does not seem to be the case after all.
> >>>
> >>>
> >>>
> >>> You might have missed a subtlety in what I said above.  The Python
> >>> interpreter itself is calling wait4() to reap your child process.  Cygwin
> >>> has told Python one of its children has died.  You won't get the chance to
> >>> wait() for it yourself.  Cygwin *does* have a handle to the process, but it
> >>> gets closed as part of Python calling wait4().
> >>
> >>
> >> To be clear, wait4() is not called from Python until the script
> >> explicitly calls p.wait().
> >> In other words, when run this step by step (e.g. in gdb) I don't see a
> >> wait4() call until the point where the script explicitly waits().  I
> >> don't see any reason Python would do this behind the scenes.
> >
> >
> > You're right.  I missed the wait in your script and ASSumed too much of the
> > Python interpreter :-( .
> >
> >
> >>>> Anyways, I think it would be nicer if /proc returned at least partial
> >>>> information on zombie processes, rather than an error.  I have a patch
> >>>> to this effect for /proc/<pid>/stat, and will add a few more as well.
> >>>> To me /proc/<pid>/stat was the most important because that's the
> >>>> easiest way to check the process's state in the first place!  Now I
> >>>> also have to catch EINVAL as well and assume that means a zombie
> >>>> process.
> >>>
> >>>
> >>>
> >>> The file /proc/<pid>/stat is there until Cygwin finishes cleanup of the
> >>> child due to Python having wait()ed for it.  When you run your test script,
> >>> pay attention to the process state character in those cases where you
> >>> successfully read the stat file.  It's often S (stopped, I think) or R
> >>> (running) but I also see Z (zombie) sometimes.  Your script is in a race
> >>> with Cygwin, and you cannot guarantee you'll see a killed process's state
> >>> before Cygwin cleans it up.
> >>>
> >>> One way around this *might* be to install a SIGCHLD handler in your Python
> >>> script.  If that's possible, that should tell you when your child exits.
> >>
> >>
> >> Perhaps the Python script is a red herring.  I just wrote it to
> >> demonstrate the problem.  The difference between where I send stdout
> >> to is strange, but you're likely right that it just comes down to
> >> subtle timing differences.  Here's a C program that demonstrates the
> >> same issue more reliably.  Interestingly, it works when I run it in
> >> strace (probably just because of the strace overhead) but not when I
> >> run it normally.
> >>
> >> My point in all this is I'm confused why Cygwin would give up its
> >> handles to the Windows process before wait() has been called.
> >>
> >> (In fact, it's pretty confusing to have fopen returning EINVAL which
> >> according to [1] it should only be doing if the mode string were
> >> invalid.)
> >>
> >> Thanks,
> >> Erik
> >>
> >> [1] http://pubs.opengroup.org/onlinepubs/9699919799/functions/fopen.html
> >
> >
> > O.K., you may be on to something amiss in the Cygwin DLL.  Thanks for the
> > STC in C; that'll help somebody looking further at this.  I'm out of ideas.
> > It might be possible to reduce strace overhead somewhat by selecting a
> > smaller set of trace options than the default.
>
> Note: My previous test program had a bug in do_child() (not correctly
> terminating the argv array).  The attached program fixes the bug.
> I've also attached a (truncated) strace log from it.

With apologies for reviving a two-year-old thread: I've finally gotten
back to working on my port of psutil [1].  I was getting some
confusing errors reading the /proc/[pid]/stat files of recently
created processes that had quickly become zombies.  I had completely
forgotten about this issue until I saw that trying to read the stat
file was resulting in EINVAL ("invalid argument"), and something about
that rang a bell.
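
For anyone who wants to check this on their own system, here is a minimal
sketch of the scenario described above.  It is not the test program from
the earlier messages; the function name and the 0.2 second delay are my own
choices for illustration.  It forks a child that exits immediately, leaves
it unreaped for a moment, and then tries to read its stat file.

```python
import errno
import os
import time


def read_child_stat_after_exit(delay=0.2):
    """Fork a child that exits at once, leave it unreaped, then read its stat file.

    Returns ("ok", stat_text) if the read succeeds, or ("error", errno_name)
    if opening or reading the file fails.
    """
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child: exit immediately, skipping Python cleanup handlers
    try:
        # Give the child time to exit.  The parent has not waited on it yet,
        # so it should now be a zombie.
        time.sleep(delay)
        try:
            with open("/proc/%d/stat" % pid) as f:
                return ("ok", f.read())
        except OSError as exc:
            return ("error", errno.errorcode.get(exc.errno, str(exc.errno)))
    finally:
        os.waitpid(pid, 0)  # reap the child so no zombie is left behind


if __name__ == "__main__":
    print(read_child_stat_after_exit())
```

On a system affected by the problem I would expect the "error" result with
EINVAL.  Since this is timing dependent, it may take several runs or a
different delay to see it.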

So, I can confirm that this is still an issue.  Apparently I wrote
that I had a patch to Cygwin for this.  I have no idea where that
patch is, but I'll look for it or try to recreate it.  I think the
idea for the patch was to at least make a zombie process's stat file
readable so that the state field ("Z") can be read, and maybe fill the
remaining fields with 0.
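
Until such a patch exists, a caller has to work around the error itself.
The following is a rough sketch of that workaround, not code from psutil or
from Cygwin: the function names and the proc_root parameter are made up
for illustration, and treating EINVAL as "zombie" is exactly the guess
discussed earlier in this thread rather than documented behavior.

```python
import errno


def parse_stat_state(stat_text):
    """Return the one-letter state field from the text of a /proc/<pid>/stat file."""
    # The command name (field 2) is wrapped in parentheses and may itself
    # contain spaces or parentheses, so split after the last ')'.
    rest = stat_text[stat_text.rindex(")") + 1:]
    return rest.split()[0]


def process_state(pid, proc_root="/proc"):
    """Return the state letter for pid, guessing "Z" if the read fails with EINVAL."""
    try:
        with open("%s/%d/stat" % (proc_root, pid)) as f:
            return parse_stat_state(f.read())
    except OSError as exc:
        if exc.errno == errno.EINVAL:
            # Assumption: EINVAL on this file means the process is a zombie.
            return "Z"
        raise
```

If the stat file were readable for zombies, the except branch would no
longer be needed and parse_stat_state() alone would return "Z".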

Once I find or recreate that patch, I'll submit it to cygwin-patches.


[1] https://psutil.readthedocs.io/en/latest/

