Mail Archives: cygwin/2006/03/30/15:33:34
Corinna Vinschen wrote:
> O_TEXT is correct because gawk is a text tool in the first place and
> it should treat input lines identical, regardless if they have DOS
> or UNIX lineendings.
Hi Corinna, thanks for the prompt reply.
If I understand you correctly, the fix in -3 has to do with converting
DOS-style CRLFs to LFs. This appears to be the issue. The output from
rsync (on all platforms--windows/unix/POSIX/whatever) contains CR
characters (0x0d) by themselves. This is what accounts for the output of
rsync "overwriting" itself when you run it alone from a bash prompt.
Here's a snippet of hexdump output from rsync:
$ rsync -Pv /cygdrive/c/backup2 10.0.0.204:~ | xxd
0000000: 6261 636b 7570 320a 2020 2020 2020 2020  backup2.        
0000010: 2037 3030 2020 2030 2520 2020 2030 2e30   700   0%    0.0
0000020: 306b 422f 7320 2020 2030 3a30 303a 3030  0kB/s    0:00:00
0000030: 0d20 2020 2020 3133 3736 3137 3620 2020  .     1376176   
0000040: 3025 2020 2020 312e 3238 4d42 2f73 2020  0%    1.28MB/s  
0000050: 2020 303a 3133 3a33 350d 2020 2020 2032    0:13:35.     2
You can see the 0d all by itself at address 0000030, and again at 0000059.
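For anyone who wants to check the lone-CR observation without reading the hex by eye, here is a minimal Python sketch. The function name lone_cr_offsets is made up for illustration; the bytes are copied from the xxd dump above.

```python
# Hex bytes copied from the xxd dump above (offsets 0x00 through 0x5f).
HEX = """
6261 636b 7570 320a 2020 2020 2020 2020
2037 3030 2020 2030 2520 2020 2030 2e30
306b 422f 7320 2020 2030 3a30 303a 3030
0d20 2020 2020 3133 3736 3137 3620 2020
3025 2020 2020 312e 3238 4d42 2f73 2020
2020 303a 3133 3a33 350d 2020 2020 2032
"""
dump = bytes.fromhex("".join(HEX.split()))


def lone_cr_offsets(data: bytes) -> list[int]:
    """Return offsets of every CR (0x0d) not immediately followed by LF (0x0a).

    A CR that is the very last byte of data is also reported, since no LF
    follows it within the data we have.
    """
    return [
        i
        for i, byte in enumerate(data)
        if byte == 0x0D and data[i + 1 : i + 2] != b"\n"
    ]


for offset in lone_cr_offsets(dump):
    print(f"lone CR at offset {offset:#09x}")
```

On the bytes above this reports the two offsets mentioned in the text, 0x30 and 0x59.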
It appears to me that, because the file is opened as O_TEXT, gawk is
hanging while it waits for an LF character to follow the CR (which
never comes). Does this sound likely to you?
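To illustrate the kind of behavior I am guessing at, here is a small Python sketch of a streaming CRLF-to-LF translator. This is purely an assumption about how a text-mode layer might work, not something taken from the Cygwin or gawk sources; the class name CrlfTranslator is hypothetical. The point is that a CR at the end of a chunk has to be held back until the next byte arrives, so a reader that never sees more input (and never flushes) would appear to stall on it.

```python
class CrlfTranslator:
    """Streaming CRLF -> LF translation; bare CRs pass through unchanged.

    Hypothetical illustration only, not the Cygwin implementation.
    """

    def __init__(self) -> None:
        self._pending_cr = False

    def feed(self, chunk: bytes) -> bytes:
        # Re-attach a CR held back from the previous chunk.
        if self._pending_cr:
            chunk = b"\r" + chunk
            self._pending_cr = False
        # A trailing CR is ambiguous: an LF may or may not follow in the
        # next chunk, so hold it back instead of emitting it now.
        if chunk.endswith(b"\r"):
            self._pending_cr = True
            chunk = chunk[:-1]
        return chunk.replace(b"\r\n", b"\n")

    def flush(self) -> bytes:
        # At end of input a held-back CR can no longer be part of a CRLF.
        if self._pending_cr:
            self._pending_cr = False
            return b"\r"
        return b""


translator = CrlfTranslator()
first = translator.feed(b"0:00:00\r")   # the CR is held back here
second = translator.feed(b"     1376176")
print(first, second)
```

If something like this is happening, the held-back CR is only released by more input or by end of file, which would match what I see with rsync's progress output.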
> I can't tell why it fails for you, because I can't reproduce this
> locally.
I'm working on a short script that reproduces the problem for all
parties; I'll post it here when I have it. Or would you rather I send it
directly to you?
Also, I took a look at some of the source for other utilities that work
with text input; these included tail, head, cat, and sed. I don't see
any of those utilities opening the input file the way you are in
gawk, and in fact a look at the ChangeLog for coreutils hints that they
used setmode at one time and have since removed it (why, I don't know).
Comments abound like this in the ChangeLog:
ChangeLog: * src/cat.c (main): Avoid setmode; use POSIX-specified
routines instead.
My thinking was, "gawk should probably open files the same way sed
does," but maybe my thinking is in error on this point. Your thoughts?
> As for the O_BINARY mode, in theory there's a way to
> accomplish that without rebuilding gawk by setting the BINMODE
> variable:
>
> gawk -v BINMODE=r [...]
>
> Unfortunately it turns out that this doesn't work because gawk fails
> to call the setmode function in this case on Cygwin. I'll upload a
> patched gawk soon. If you want to apply it by yourself, try this:
> (snip...)
This is a suitable workaround for me, but I would like to humbly submit
that gawk shouldn't hang regardless of the input given to it. If the
input isn't acceptable, perhaps it should error to stderr or some such
and exit. Your thoughts?
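For comparison, here is a rough Python sketch of what I would expect a consumer of rsync's progress output to do when it reads the stream in binary mode: treat a bare CR as a record terminator just like LF. The function name progress_records is made up, and the sample bytes are taken from the dump earlier in this message; this says nothing about how gawk itself splits records.

```python
def progress_records(raw: bytes) -> list[str]:
    """Split raw output into records on LF, CR, or CRLF.

    bytes.splitlines() treats all three as line terminators, so a bare CR
    ends a record instead of leaving the reader waiting for an LF.
    """
    return [
        record.decode("ascii", "replace").strip()
        for record in raw.splitlines()
        if record.strip()
    ]


# Sample bytes in the same shape as the rsync output shown earlier.
sample = (
    b"backup2\n"
    b"         700   0%    0.00kB/s    0:00:00\r"
    b"     1376176   0%    1.28MB/s    0:13:35\r"
)

for record in progress_records(sample):
    print(record)
```

Each progress update comes out as its own record, with no dependence on an LF ever arriving.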
Again, I'll come up with a short shell script that reproduces the issue
for you, and hopefully together we can come up with an agreeable solution.
Regards;
David Carter