Mail Archives: cygwin/2006/03/30/19:45:19

X-Spam-Check-By: sourceware.org
Message-ID: <442C7B8F.9070000@pondol.com>
Date: Thu, 30 Mar 2006 18:45:03 -0600
From: David Carter <carter AT pondol DOT com>
User-Agent: Thunderbird 1.5 (Windows/20051201)
MIME-Version: 1.0
To: cygwin AT cygwin DOT com
Subject: Re: problems with gawk 3.1.5-3 hanging -- more info
References: <442C25D0 DOT 7030605 AT pondol DOT com> <442C3197 DOT 7090309 AT pondol DOT com> <20060330200757 DOT GO20907 AT calimero DOT vinschen DOT de> <442C408B DOT 3080409 AT carter DOT to> <Pine DOT GSO DOT 4 DOT 63 DOT 0603301640430 DOT 16543 AT access1 DOT cims DOT nyu DOT edu>
In-Reply-To: <Pine.GSO.4.63.0603301640430.16543@access1.cims.nyu.edu>
X-IsSubscribed: yes
Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm
List-Subscribe: <mailto:cygwin-subscribe AT cygwin DOT com>
List-Archive: <http://sourceware.org/ml/cygwin/>
List-Post: <mailto:cygwin AT cygwin DOT com>
List-Help: <mailto:cygwin-help AT cygwin DOT com>, <http://sourceware.org/ml/#faqs>
Sender: cygwin-owner AT cygwin DOT com
Mail-Followup-To: cygwin AT cygwin DOT com
Delivered-To: mailing list cygwin AT cygwin DOT com

Igor Peshansky wrote:
> On Thu, 30 Mar 2006, David Carter wrote:
>> It appears to me that by opening the file as O_TEXT, that gawk is
>> hanging because it is waiting for that LF char to follow the CR (which
>> never comes). Does this sound likely to you?
> 
> If this theory were true, "echo -ne 'aa\rb' | gawk '{print $0}'" would
> hang.  It doesn't for me, even with textmode pipes...

Yes, I realized this myself soon after posting. Your echo command 
doesn't hang for me either. As I said in my original post, this is one 
of those annoying bugs that never shows up when I try to trigger it 
interactively, but usually (though not always) shows up when I run my 
regular script.  This is another clue that my initial "theory" was 
incorrect: if it were true, the program would hang every time.

Here's an example line, callable from a prompt, that usually hangs:

$ rsync -Pv sourcefile rmachine:/rpath/ | \
   gawk 'BEGIN { RS="\r|\n" } {print $0; fflush();}'

To test this, I recommend using a source/remote combination for rsync 
that will take about 30 seconds to a minute to complete. This will 
create enough output for gawk to replicate the issue.

If this hangs (it may not hang the first time; give it 2 or 3 runs), 
you'll stop getting output to stdout and it will just sit there. If you 
go to another prompt to do a ps, you'll see that rsync is done running 
but gawk is still sitting there. CTRL+C in the window running the script 
does nothing. You need to kill the gawk process from another bash prompt.
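
Because the hang is intermittent, it may be easier to wrap each trial in 
a time limit than to watch the terminal. The following is only a sketch 
under stated assumptions: run_with_watchdog is a made-up helper name, 
the coreutils "timeout" utility is assumed to be installed, and 
pipeline.sh in the usage comment is a hypothetical script containing the 
rsync | gawk line above. The command's output is discarded so that only 
the verdict is printed.

```shell
# Hypothetical helper: run a command under a time limit and report
# whether it finished by itself or had to be killed.
# Assumes the coreutils `timeout` utility is available.
run_with_watchdog() {
    limit=$1
    shift
    # -k 2: if the command ignores SIGTERM, send SIGKILL 2 seconds later.
    timeout -k 2 "$limit" "$@" >/dev/null 2>&1
    status=$?
    case $status in
        # 124: timeout expired; 137: SIGKILL had to be used (128+9).
        # A command that exits with one of these codes by itself would
        # be misreported, which is acceptable for a quick check.
        124|137) echo "hung" ;;
        *)       echo "finished (exit $status)" ;;
    esac
}

# Usage (pipeline.sh is hypothetical; pick a limit well above the
# normal transfer time):
#   for i in 1 2 3; do run_with_watchdog 300 sh ./pipeline.sh; done
```

Three or so trials match the "give it 2 or 3 runs" advice above; whether 
killing the wrapper also cleans up a stuck gawk on a given system is 
something to check with ps afterwards.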

> Try saving the output of rsync to file and running gawk over that
> separately...  

Good idea. Per your advice, I tried doing something like the following:

$ rsync -Pv sourcefile rmachine:/rpath/ > rsync.out
$ cat rsync.out | \
   gawk 'BEGIN { RS="\r|\n" } {print $0; fflush();}'
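
With the capture on disk, one quick sanity check is to confirm that it 
really contains carriage returns, so that the file-based run gives gawk 
the same kind of record separators as the pipe does. This is a small 
sketch; count_cr is a hypothetical helper name, not part of rsync or 
gawk.

```shell
# Hypothetical helper: print the number of carriage-return bytes in a file.
count_cr() {
    # tr -cd '\r' deletes every byte except CR; wc -c counts what is left.
    # The arithmetic expansion strips padding that some wc versions add.
    echo $(( $(tr -cd '\r' < "$1" | wc -c) ))
}

# Usage (rsync.out is the capture file from the commands above):
#   count_cr rsync.out
```

A count of zero would suggest the capture differs from what gawk sees on 
the pipe, in which case the file test says little about the hang.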

Surprisingly, that code never hangs. Also, this never hangs:

$ rsync -Pv sourcefile rmachine:/rpath/ | xxd | xxd -r | \
   gawk 'BEGIN { RS="\r|\n" } {print $0; fflush();}'

However, this usually hangs:

$ rsync -Pv sourcefile rmachine:/rpath/ | cat | \
   gawk 'BEGIN { RS="\r|\n" } {print $0; fflush();}'

> Also, if gawk really hangs, you can run it under strace to
> see exactly what it was doing up to the hang (but please don't post the
> strace output unless you're asked to do so by Corinna or CGF).

I tried something like the following:

$ rsync -Pv sourcefile rmachine:/rpath/ | strace \
   gawk 'BEGIN { RS="\r|\n" } {print $0; fflush();}'

But, unfortunately, this never hangs. So I tried this:

$ ( sleep 10; rsync -Pv sourcefile rmachine:/rpath/ ) | \
   gawk 'BEGIN { RS="\r|\n" } {print $0; fflush();}'

and then I go to another window and start strace on the gawk PID. This 
hangs (usually). Looking at the strace output, the last thing gawk does is:

    87 22612601 [read_pipe] gawk 188 fhandler_base::read: returning 1, text mode

Every time it hangs, I get "read returning 1, text mode". If I look at 
the strace output for the successful (non-hanging) executions, I never 
get a "read returning 1, text mode."
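
To compare hanging and non-hanging runs without reading each trace by 
eye, the saved strace output can be searched for that message. This is a 
sketch only: count_one_byte_text_reads is a made-up helper name, 
gawk.trace is a hypothetical file holding the saved strace output, and 
it assumes each message sits on a single line in the log (the line 
break in the excerpt above comes from mail wrapping).

```shell
# Hypothetical helper: count log lines that report a 1-byte text-mode
# read, using the message text quoted above.
count_one_byte_text_reads() {
    # -c prints the number of matching lines; -F treats the pattern as a
    # fixed string. grep exits with status 1 when nothing matches, so
    # "|| true" keeps the helper from failing in that case.
    grep -cF 'fhandler_base::read: returning 1, text mode' "$1" || true
}

# Usage (gawk.trace is a hypothetical saved trace):
#   count_one_byte_text_reads gawk.trace
```

If the observation above holds, traces from hanging runs should give a 
nonzero count and traces from successful runs should give 0.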

All of this makes me wonder whether:
   a) rsync is doing something with its stdout file descriptor that it 
      shouldn't be doing, or
   b) gawk is doing something with its stdin file descriptor that it 
      shouldn't be doing.

If a), then why doesn't it break when I just redirect the output of 
rsync to a file? If b), then what is it about piping the output of rsync 
to gawk that is different (from gawk's point of view) than when I just 
save the rsync output to a file and then send the contents of the file 
to gawk?

And another thing...why would any of this make any difference if gawk 
opens the file as O_TEXT vs O_BINARY?

> HTH,

It was a great help. Thanks, Igor. Any other light you can shed is much 
appreciated.

Regards;

David Carter

--
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
Problem reports:       http://cygwin.com/problems.html
Documentation:         http://cygwin.com/docs.html
FAQ:                   http://cygwin.com/faq/
