Mail Archives: cygwin/2003/01/15/19:48:44
On Wed, 15 Jan 2003, Stacey Sheldon wrote:
> Mailing list search didn't find this, nor does it appear
> in the FAQ... hopefully this isn't old news to all of you.
>
> Files read from a pipe are treated differently by grep
> than files read directly. This results in some unexpected
> (by me) behaviour when using grep on files which use
> a DOS line-end (cr/nl). This looks like a bug to me.
>
> I'd expect the following commands to have equivalent
> results:
>
> grep myregex blah
> grep myregex < blah
> cat blah | grep myregex
>
> They are equivalent when the regular file blah uses
> Unix line ends, but they differ for a file blahdos which
> uses DOS line ends. It appears to me as though grep
> is treating its input as binary when reading from a pipe,
> but correctly using "undossify_input()" in other cases.
>
> Here is an example. I've created two files, blah (nl line-end)
> and blahdos (cr/nl line-end).
>
> $ cat blah
> foobarTest
> $ od -Ax -a blah
> 000000 f o o b a r T e s t nl
> 00000b
> $ od -Ax -a blahdos
> 000000 f o o b a r T e s t cr nl
> 00000c
>
> These files should match the regex 'Test$' in all cases,
> but grep on blahdos fails for this case:
>
> $ cat blahdos | grep 'Test$'
> $
>
> And here's why (note the -v to invert the match so we have
> something to look at):
>
> $ cat blahdos | grep -v 'Test$' | od -Ax -a
> 000000 f o o b a r T e s t cr nl
> 00000c
>
> There's still a cr/nl on the output which wouldn't be there if
> grep had interpreted its input as having DOS line ends. Here's
> what a successful grep of the UNIX line end file looks like:
>
> $ cat blah | grep 'Test$' | od -Ax -a
> 000000 f o o b a r T e s t nl
> 00000b
>
> In fact, if I read the blahdos file in any other way except through
> a pipe, it successfully matches (note the stripped out cr on the output):
>
> $ grep 'Test$' blahdos | od -Ax -a
> 000000 f o o b a r T e s t nl
> 00000b
> $ grep 'Test$' < blahdos | od -Ax -a
> 000000 f o o b a r T e s t nl
> 00000b
>
> Just in case you might think that this has something to do with cat
> (I did), here's the output of cat for each file:
>
> $ cat blah | od -Ax -a
> 000000 f o o b a r T e s t nl
> 00000b
> $ cat blahdos | od -Ax -a
> 000000 f o o b a r T e s t cr nl
> 00000c
>
> Using head instead of cat gives the same results as well, just to
> completely remove cat from the picture.
>
> I'm currently running these versions of tools on win2k:
> cygwin 1.3.18-1
> textutils 2.0.21 (cat, od, head)
> grep 2.5
> bash 2.05b.0(8)-release
>
> I also tried this out with cygwin 1.3.17-1 with identical results.
>
> If you need any further information, please cc me directly since I
> don't read the mailing lists very often.
>
> Stacey.
Stacey,
This is not a bug. This is expected behavior. For details, read
<http://cygwin.com/cygwin-ug-net/using-cygwinenv.html>.
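If the commands need to behave the same regardless of how the input is
opened, one option is to strip the carriage returns explicitly before grep
sees them. This is only a minimal sketch and is not taken from the User's
Guide page above; the temporary directory and the variable names (tmpdir,
matched) are illustrative, and the file is recreated here with printf so the
example is self-contained.

```shell
# Recreate the DOS-style test file from the report (cr/nl line end).
# tmpdir and matched are illustrative names, not part of the original report.
tmpdir=$(mktemp -d)
printf 'foobarTest\r\n' > "$tmpdir/blahdos"

# Delete every carriage return in the stream, then match. The regex 'Test$'
# now sees a plain nl line end even though the data arrives through a pipe.
matched=$(cat "$tmpdir/blahdos" | tr -d '\r' | grep 'Test$')
echo "$matched"

# Clean up the scratch directory.
rm -rf "$tmpdir"
```

Note that tr -d '\r' removes every carriage return in the input, not just
those at line ends, which is fine for text files like the ones in the report
but would alter binary data.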
Igor
--
http://cs.nyu.edu/~pechtcha/
|\ _,,,---,,_ pechtcha AT cs DOT nyu DOT edu
ZZZzz /,`.-'`' -. ;-;;,_ igor AT watson DOT ibm DOT com
|,4- ) )-,_. ,\ ( `'-' Igor Pechtchanski
'---''(_/--' `-'\_) fL a.k.a JaguaR-R-R-r-r-r-.-.-. Meow!
Oh, boy, virtual memory! Now I'm gonna make myself a really *big* RAMdisk!
-- /usr/games/fortune