X-Recipient: archive-cygwin AT delorie DOT com DomainKey-Signature: a=rsa-sha1; c=nofws; d=sourceware.org; h=list-id :list-unsubscribe:list-subscribe:list-archive:list-post :list-help:sender:subject:to:references:from:message-id:date :mime-version:in-reply-to:content-type :content-transfer-encoding; q=dns; s=default; b=j2jkSVy4cGdDQiaz Gmuz9efpdGeBTNDopbOxxkcbSiPAVgvxS3U1q8GaNQrx4t/s96OH6iMjY9sKnTEO T4KeBNgjzmZ6lCtZNtGsnzPyjqBH6bfLljW306avP2U1XLx5JUFnU0sFuoOBsgYw sENsnlTzN7ESJ1/AiK9gxlN+YvM= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=sourceware.org; h=list-id :list-unsubscribe:list-subscribe:list-archive:list-post :list-help:sender:subject:to:references:from:message-id:date :mime-version:in-reply-to:content-type :content-transfer-encoding; s=default; bh=3kU7vhy0L7AFbYmC5T/e7h G/uYY=; b=hLNhCanpiORvGHdtKYYO/ItrDBcuc+JXwoy1BwhV4kOh/n10edqCI/ saIlg7M7WwXdyDJ4JU/e+Ct/Y/7bXI0kk43svFuAShqOm/4BYmsFok0u0+u+Fyew LleZ5nmMz6EI4sEtGqn0npiZRWtGncSW4VrPBZYl0uMQBRZqcOh/4= Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm List-Id: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: cygwin-owner AT cygwin DOT com Mail-Followup-To: cygwin AT cygwin DOT com Delivered-To: mailing list cygwin AT cygwin DOT com Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-2.7 required=5.0 tests=AWL,BAYES_00,RCVD_IN_DNSWL_LOW,RP_MATCHES_RCVD,SPF_PASS autolearn=ham version=3.3.2 spammy=mystery, Notice, 2524, 1788 X-HELO: sasl.smtp.pobox.com Subject: Re: long I/O delays when strace is running To: cygwin AT cygwin DOT com References: <91DCAC3CB99C724EB365BB64677FBE7B16EBBD AT MX204CL04 DOT corp DOT emc DOT com> <46e39bce-9782-5c74-c196-35ee97ebbc64 AT pobox DOT com> <58F9D314 DOT 2000100 AT maxrnd DOT com> <7d56e728-32a3-9179-b002-e83880e297d0 AT pobox DOT com> <58FA83DB DOT 3000209 AT maxrnd DOT com> <58FC5AFB DOT 3010903 AT maxrnd DOT com> From: Daniel Santos Message-ID: <00fb01d0-acdc-68da-e31e-45ffa7099043@pobox.com> Date: Mon, 24 Apr 2017 00:36:33 -0500 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.5.1 MIME-Version: 1.0 In-Reply-To: <58FC5AFB.3010903@maxrnd.com> Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit X-Pobox-Relay-ID: 52195DEE-28AF-11E7-A000-C260AE2156B6-06139138!pb-smtp2.pobox.com X-IsSubscribed: yes Well, waiting for GNU/Linux tests to run, so I had a little more time to play with this. On 04/23/2017 02:42 AM, Mark Geisert wrote: > Daniel Santos wrote: >> Well thank you, I wish I had read this earlier. I've been trying to >> debug (with >> gdb) strace (following children) and now I know why the debugger is >> blowing >> through breakpoints, because I may be debugging strace, but strace >> debugging >> it's own child. I added a _pinfo::dump () member function that I'm >> calling from >> pinfo::init () after the test "if (!created && !(flag & PID_NEW))" >> and it's >> crashing because for some processes, the entire _pinfo struct isn't >> mapped in, >> just the first page. Of course, I'm a newbie in this sphere, so >> maybe I've >> missed something and there's sometimes a reason for this? (The first >> part of >> the struct looks normal.) Otherwise, that would be another odd flaw. > > I don't know enough about pinfo.h and pinfo.cc to say anything > definite on this. Well here's one mystery solved. I modified my dump() function to first dump the first page of the struct and then the second, so that when it crashes I can see the contents of the first page at least: _pinfo 0x30000 { pid 2044, process_state 0x00004001, ppid 2044, exitcode 0 cygstarted 0, dwProcessId 0x00000000, progname "", The process_state's bit 0x00004000 happens to be PID_EXECED, so this is intentionally shorter because there's no more information. As for the strange _pinfo object that causes the delay, this appears to be an error in the process database because the windows PID is a duplicate of another process entry. (I modified ps.exe to output the _pinfo::process_state field.) $ strace --trace-children --output=/tmp/ps.log --mask=all ps -el PID PPID PGID WINPID TTY UID STIME STATE COMMAND 948 1788 948 1772 pty0 197608 00:04:22 000000c1 /usr/bin/bash 2720 1 2720 2720 ? 18 00:02:42 00000041 /usr/bin/cygrunsrv 860 0 0 860 pty0 0 00:04:30 00000001 /usr/bin/cygrunsrv 984 1 2168 984 pty0 197608 00:04:30 00000041 /usr/bin/ps 2376 1628 2376 2612 ? 197610 00:02:45 000000c1 /usr/sbin/sshd 2524 2720 2524 1404 ? 18 00:02:42 000000c1 /usr/sbin/cygserver 2168 948 2168 860 pty0 197608 00:04:30 00000061 /usr/bin/strace 1788 2376 1788 2788 ? 197610 00:04:22 000000c1 /usr/sbin/sshd 1628 1 1628 1628 ? 197610 00:02:45 00000041 /usr/bin/cygrunsrv The entry that causes the delay is the one with a zero PPID and PGID and the progname field is wrong -- It isn't "cygrunsrv". It always has the same progname name of the previous process. Notice that the WINPID field has the same value as the PID of the strace process that's real (2168). So my guess is that this object shouldn't even be there. Daniel -- Problem reports: http://cygwin.com/problems.html FAQ: http://cygwin.com/faq/ Documentation: http://cygwin.com/docs.html Unsubscribe info: http://cygwin.com/ml/#unsubscribe-simple