delorie.com/archives/browse.cgi   search  
Mail Archives: cygwin/2017/04/24/03:01:23

X-Recipient: archive-cygwin AT delorie DOT com
DomainKey-Signature: a=rsa-sha1; c=nofws; d=sourceware.org; h=list-id
:list-unsubscribe:list-subscribe:list-archive:list-post
:list-help:sender:subject:to:references:from:message-id:date
:mime-version:in-reply-to:content-type
:content-transfer-encoding; q=dns; s=default; b=VPHMfQBfIucchRCb
itgoG/E2ZdVB87kHp0fk0ueSmSy+sP1lcRAc2czQQtTm6L85iT2fiZUGwGy07jO0
sfhRdCqV9n6LEaJhi18a3ZwAmywxXWbn/i6NLlFW7Iv45GNM1QLTVWz3d0QKbBS0
K0f3sB0/rl3LBCArcMtXL3VNNoU=
DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=sourceware.org; h=list-id
:list-unsubscribe:list-subscribe:list-archive:list-post
:list-help:sender:subject:to:references:from:message-id:date
:mime-version:in-reply-to:content-type
:content-transfer-encoding; s=default; bh=w1AP+2YPz4q8rD0HmSKb03
uZaEY=; b=qOBsOGLEcvwgriyY4uwcNacUHGeCFFBf31/NWYP3BJOJ6OSRS9m/ls
MT91Y0prrvpDXdPf1AyA/AY4eXZG9VTwHYHBCwOoLt4Vy2o6xNk8/PHM9+JLUwe3
Tr+auH6LueZ8f6wf7KSSzEWsIR6d373kNPQpcYgsty6V02YC0FSqs=
Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm
List-Id: <cygwin.cygwin.com>
List-Subscribe: <mailto:cygwin-subscribe AT cygwin DOT com>
List-Archive: <http://sourceware.org/ml/cygwin/>
List-Post: <mailto:cygwin AT cygwin DOT com>
List-Help: <mailto:cygwin-help AT cygwin DOT com>, <http://sourceware.org/ml/#faqs>
Sender: cygwin-owner AT cygwin DOT com
Mail-Followup-To: cygwin AT cygwin DOT com
Delivered-To: mailing list cygwin AT cygwin DOT com
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-0.7 required=5.0 tests=AWL,BAYES_00,KAM_LAZY_DOMAIN_SECURITY autolearn=no version=3.3.2 spammy=H*r:8.12.11, H*F:U*mark, work, work!
X-HELO: m0.truegem.net
Subject: Re: long I/O delays when strace is running
To: cygwin AT cygwin DOT com
References: <be8713ec-5b68-e0e9-c5a5-d48fce491a2d AT pobox DOT com> <91DCAC3CB99C724EB365BB64677FBE7B16EBBD AT MX204CL04 DOT corp DOT emc DOT com> <46e39bce-9782-5c74-c196-35ee97ebbc64 AT pobox DOT com> <ace81c86-80d5-bef9-6b93-7d36dde5a4e8 AT pobox DOT com> <bff9c5ea-85d9-bf4e-2f25-271c1490565f AT pobox DOT com> <58F9D314 DOT 2000100 AT maxrnd DOT com> <7d56e728-32a3-9179-b002-e83880e297d0 AT pobox DOT com> <58FA83DB DOT 3000209 AT maxrnd DOT com> <a4afe377-3b4f-8373-8116-e812a8d82fbe AT pobox DOT com> <58FC5AFB DOT 3010903 AT maxrnd DOT com> <00fb01d0-acdc-68da-e31e-45ffa7099043 AT pobox DOT com>
From: Mark Geisert <mark AT maxrnd DOT com>
Message-ID: <58FDA29D.7080908@maxrnd.com>
Date: Mon, 24 Apr 2017 00:00:45 -0700
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:43.0) Gecko/20100101 Firefox/43.0 SeaMonkey/2.40
MIME-Version: 1.0
In-Reply-To: <00fb01d0-acdc-68da-e31e-45ffa7099043@pobox.com>

Daniel Santos wrote:
> Well, waiting for GNU/Linux tests to run, so I had a little more time to play
> with this.
>
> On 04/23/2017 02:42 AM, Mark Geisert wrote:
>> Daniel Santos wrote:
>>> Well thank you, I wish I had read this earlier. I've been trying to debug (with
>>> gdb) strace (following children) and now I know why the debugger is blowing
>>> through breakpoints, because I may be debugging strace, but strace debugging
>>> it's own child.  I added a _pinfo::dump () member function that I'm calling from
>>> pinfo::init () after the test "if (!created && !(flag & PID_NEW))" and it's
>>> crashing because for some processes, the entire _pinfo struct isn't mapped in,
>>> just the first page.  Of course, I'm a newbie in this sphere, so maybe I've
>>> missed something and there's sometimes a reason for this?  (The first part of
>>> the struct looks normal.) Otherwise, that would be another odd flaw.
>>
>> I don't know enough about pinfo.h and pinfo.cc to say anything definite on this.
>
> Well here's one mystery solved.  I modified my dump() function to first dump the
> first page of the struct and then the second, so that when it crashes I can see
> the contents of the first page at least:
>
> _pinfo 0x30000 {
>    pid 2044,
>    process_state 0x00004001,
>    ppid 2044,
>    exitcode 0
>    cygstarted 0,
>    dwProcessId 0x00000000,
>    progname "",
>
> The process_state's bit 0x00004000 happens to be PID_EXECED, so this is
> intentionally shorter because there's no more information.
>
> As for the strange _pinfo object that causes the delay, this appears to be an
> error in the process database because the windows PID is a duplicate of another
> process entry. (I modified ps.exe to output the _pinfo::process_state field.)
>
> $ strace --trace-children --output=/tmp/ps.log --mask=all ps -el
>        PID    PPID    PGID     WINPID   TTY         UID STIME  STATE COMMAND
>        948    1788     948       1772  pty0      197608 00:04:22 000000c1
> /usr/bin/bash
>       2720       1    2720       2720  ?             18 00:02:42 00000041
> /usr/bin/cygrunsrv
>        860       0       0        860  pty0           0 00:04:30 00000001
> /usr/bin/cygrunsrv
>        984       1    2168        984  pty0      197608 00:04:30 00000041
> /usr/bin/ps
>       2376    1628    2376       2612  ?         197610 00:02:45 000000c1
> /usr/sbin/sshd
>       2524    2720    2524       1404  ?             18 00:02:42 000000c1
> /usr/sbin/cygserver
>       2168     948    2168        860  pty0      197608 00:04:30 00000061
> /usr/bin/strace
>       1788    2376    1788       2788  ?         197610 00:04:22 000000c1
> /usr/sbin/sshd
>       1628       1    1628       1628  ?         197610 00:02:45 00000041
> /usr/bin/cygrunsrv
>
> The entry that causes the delay is the one with a zero PPID and PGID and the
> progname field is wrong --  It isn't "cygrunsrv".  It always has the same
> progname name of the previous process.  Notice that the WINPID field has the
> same value as the PID of the strace process that's real (2168).
>
> So my guess is that this object shouldn't even be there.

Excellent debugging work!  I'm inclined to agree with your last point.  I'm 
poring over pinfo.cc as well as dcrt0.cc, which is the Cygwin DLL init code. 
The latter talks about special cases if the DLL is runtime loaded (like strace 
does) vs link-time loaded (like Cygwin apps do).  May take me a while to respond.

..mark


--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple

- Raw text -


  webmaster     delorie software   privacy  
  Copyright © 2019   by DJ Delorie     Updated Jul 2019