Mail Archives: cygwin/2006/03/30/20:42:30

Date: Thu, 30 Mar 2006 20:42:18 -0500 (EST)
From: Igor Peshansky <pechtcha AT cs DOT nyu DOT edu>
Reply-To: cygwin AT cygwin DOT com
To: David Carter <carter AT pondol DOT com>
cc: cygwin AT cygwin DOT com
Subject: Re: problems with gawk 3.1.5-3 hanging -- more info
In-Reply-To: <442C7B8F.9070000@pondol.com>
Message-ID: <Pine.GSO.4.63.0603302025110.27530@access1.cims.nyu.edu>
References: <442C25D0 DOT 7030605 AT pondol DOT com> <442C3197 DOT 7090309 AT pondol DOT com> <20060330200757 DOT GO20907 AT calimero DOT vinschen DOT de> <442C408B DOT 3080409 AT carter DOT to> <Pine DOT GSO DOT 4 DOT 63 DOT 0603301640430 DOT 16543 AT access1 DOT cims DOT nyu DOT edu> <442C7B8F DOT 9070000 AT pondol DOT com>

On Thu, 30 Mar 2006, David Carter wrote:

> Igor Peshansky wrote:
> > On Thu, 30 Mar 2006, David Carter wrote:
> > > It appears to me that, because the file is opened as O_TEXT, gawk is
> > > hanging while it waits for the LF char to follow the CR (which never
> > > comes). Does this sound likely to you?
> >
> > If this theory were true, "echo -ne 'aa\rb' | gawk '{print $0}'" would
> > hang.  It doesn't for me, even with textmode pipes...
>
> Yes, I realized this myself soon after posting. Your echo command
> doesn't hang for me either. As I said in my original post, this is one
> of those annoying bugs that if I try to make it hang interactively, it
> always works correctly (never hangs), but if I try to do it with my
> regular script, it (usually, but not always) hangs.  This is another
> clue that my initial "theory" was incorrect: if it were true, the
> program would hang regardless.
>
> Here's an example line, callable from a prompt, that usually hangs:
>
> $ rsync -Pv sourcefile rmachine:/rpath/ | \
>   gawk 'BEGIN { RS="\r|\n" } {print $0; fflush();}'
>
> To test this, I recommend using a source/remote combination for rsync
> that will take about 30 seconds to a minute to complete. This will
> create enough output for gawk to replicate the issue.
>
> If this hangs (it may not hang the first time; give it 2 or 3 runs),
> you'll stop getting output to stdout and it will just sit there. If you
> go to another prompt to do a ps, you'll see that rsync is done running
> but gawk is still sitting there. CTRL+C in the window running the script
> does nothing. You need to kill the gawk process from another bash
> prompt.
>
> > Try saving the output of rsync to file and running gawk over that
> > separately...
>
> Good idea. Per your advice, I tried doing something like the following:
>
> $ rsync -Pv sourcefile rmachine:/rpath/ > rsync.out

I would at least try "$ rsync -Pv ... | cat > rsync.out", to make sure it
goes through a pipe first.

> $ cat rsync.out | \
>   gawk 'BEGIN { RS="\r|\n" } {print $0; fflush();}'

Does "gawk 'BEGIN ...' < rsync.out" hang?
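The variants discussed above (direct pipe, an extra "cat" stage, and a redirect from a saved file) can be lined up side by side without a remote host. What follows is only a sketch: fake_progress is a hypothetical stand-in for "rsync -P" output, not something from the original report, and it may well not trigger the hang at all. It is meant to show the shape of the comparison, using the same RS setting as the thread.

```shell
#!/bin/sh
# fake_progress N: hypothetical stand-in for "rsync -P" progress output.
# Emits N progress updates, each terminated by a bare CR, then one
# LF-terminated line. Add a "sleep" in the loop to mimic a slow transfer.
fake_progress() {
    i=1
    while [ "$i" -le "$1" ]; do
        printf '%8d bytes %3d%%\r' "$((i * 1024))" "$((i * 100 / $1))"
        i=$((i + 1))
    done
    printf 'done\n'
}

# The record separator from the thread: split on either CR or LF.
split_records='BEGIN { RS="\r|\n" } { print $0; fflush(); }'

# Only run the comparison when gawk is actually installed.
if command -v gawk >/dev/null 2>&1; then
    saved=$(mktemp)
    fake_progress 5 | gawk "$split_records"          # direct pipe
    fake_progress 5 | cat | gawk "$split_records"    # extra pipe stage
    fake_progress 5 > "$saved"
    cat "$saved" | gawk "$split_records"             # saved file, via cat
    gawk "$split_records" < "$saved"                 # saved file, via redirect
    rm -f "$saved"
fi
```

If one of these variants stalls while the others finish, that narrows down which kind of input gawk is struggling with.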

> Surprisingly, that code never hangs. Also, this never hangs:
>
> $ rsync -Pv sourcefile rmachine:/rpath/ | xxd | xxd -r | \
>   gawk 'BEGIN { RS="\r|\n" } {print $0; fflush();}'
>
> However, this usually hangs:
>
> $ rsync -Pv sourcefile rmachine:/rpath/ | cat |
>   gawk 'BEGIN { RS="\r|\n" } {print $0; fflush();}'

Sounds like it would also not hang if you added "nobinmode" to your CYGWIN
environment variable.

Also, does it help if you use the octal escapes for \r and \n instead
(i.e., 'BEGIN { RS = "\015|\012" } ...')?
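For reference, the mapping between the symbolic and octal escapes can be checked from the shell. This is a small sketch using only printf and od; byte_of is a helper name made up for illustration.

```shell
#!/bin/sh
# byte_of FORMAT: print the octal value of the single byte that printf
# produces for FORMAT (e.g. '\r' or '\015').
byte_of() {
    printf "$1" | od -An -to1 | tr -d ' \n'
}

# Show each spelling next to the byte it produces.
for esc in '\r' '\015' '\n' '\012'; do
    printf '%-5s -> %s\n' "$esc" "$(byte_of "$esc")"
done
```

CR is octal 015 and LF is octal 012, so the order of the two alternatives in RS does not matter; both spellings describe the same pair of bytes.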

> > Also, if gawk really hangs, you can run it under strace to see exactly
> > what it was doing up to the hang (but please don't post the strace
> > output unless you're asked to do so by Corinna or CGF).
>
> I tried something like the following:

I'll let others try to figure out the strace output -- no ideas at the
moment...

> [snip]
> All of this makes me wonder if:
>   a) rsync is perhaps doing something with its stdout file descriptor
> that it shouldn't be doing, or that;
>   b) gawk is perhaps doing something with its stdin file descriptor that
> it shouldn't be doing.
>
> If a), then why doesn't it break when I just redirect the output of
> rsync to a file?

Because in one case the input comes from a pipe, and in the other from a
file.  Those are different.

> If b), then what is it about piping the output of rsync to gawk that is
> different (from gawk's point of view) than when I just save the rsync
> output to a file and then send the contents of the file to gawk?

Again, completely different mechanisms are invoked within Cygwin when
reading from a pipe and from a file.
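To see the difference from the reader's side, a process can ask what its standard input is attached to. The sketch below assumes /dev/stdin is available (as it is on Linux and Cygwin); stdin_kind is a hypothetical helper, and it only reports the descriptor type, it says nothing about what Cygwin does internally for each case.

```shell
#!/bin/sh
# stdin_kind: report what kind of object standard input is attached to.
# Assumes /dev/stdin exists on this system.
stdin_kind() {
    if [ -p /dev/stdin ]; then
        echo pipe
    elif [ -f /dev/stdin ]; then
        echo file
    else
        echo other
    fi
}

sample=$(mktemp)
printf 'aa\rb\n' > "$sample"

cat "$sample" | stdin_kind    # the reader is attached to a pipe
stdin_kind < "$sample"        # the reader is attached to a regular file

rm -f "$sample"
```

The same bytes arrive in both cases; only the kind of descriptor the reader holds differs, which is why "works from a file" does not rule out a problem on the pipe path.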

> And another thing...why would any of this make any difference if gawk
> opens the file as O_TEXT vs O_BINARY?

Again, no ideas yet.
	Igor
-- 
				http://cs.nyu.edu/~pechtcha/
      |\      _,,,---,,_	    pechtcha AT cs DOT nyu DOT edu | igor AT watson DOT ibm DOT com
ZZZzz /,`.-'`'    -.  ;-;;,_		Igor Peshansky, Ph.D. (name changed!)
     |,4-  ) )-,_. ,\ (  `'-'		old name: Igor Pechtchanski
    '---''(_/--'  `-'\_) fL	a.k.a JaguaR-R-R-r-r-r-.-.-.  Meow!

"Las! je suis sot... -Mais non, tu ne l'es pas, puisque tu t'en rends compte."
"But no -- you are no fool; you call yourself a fool, there's proof enough in
that!" -- Rostand, "Cyrano de Bergerac"

