Mail Archives: cygwin/2003/01/15/19:48:44

Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm
List-Subscribe: <mailto:cygwin-subscribe AT cygwin DOT com>
List-Archive: <http://sources.redhat.com/ml/cygwin/>
List-Post: <mailto:cygwin AT cygwin DOT com>
List-Help: <mailto:cygwin-help AT cygwin DOT com>, <http://sources.redhat.com/ml/#faqs>
Sender: cygwin-owner AT cygwin DOT com
Mail-Followup-To: cygwin AT cygwin DOT com
Delivered-To: mailing list cygwin AT cygwin DOT com
X-Authentication-Warning: slinky.cs.nyu.edu: pechtcha owned process doing -bs
Date: Wed, 15 Jan 2003 19:48:32 -0500 (EST)
From: Igor Pechtchanski <pechtcha AT cs DOT nyu DOT edu>
Reply-To: cygwin AT cygwin DOT com
To: Stacey Sheldon <ssheldon AT catena DOT com>
cc: cygwin AT cygwin DOT com
Subject: Re: 1.3.18: BUG: Piping DOS files to grep (v2.5) doesn't work properly
In-Reply-To: <231417CB271FD61197020002A593077FC4BEB3@cat01s2c.catena.com>
Message-ID: <Pine.GSO.4.44.0301151946050.10883-100000@slinky.cs.nyu.edu>
Importance: Normal
MIME-Version: 1.0

On Wed, 15 Jan 2003, Stacey Sheldon wrote:

> Mailing list search didn't find this, nor does it appear
> in the FAQ... hopefully this isn't old news to all of you.
>
> Files read from a pipe are treated differently by grep
> than files read directly.  This results in some unexpected
> (by me) behaviour when using grep on files which use
> a DOS line-end (cr/nl).  This looks like a bug to me.
>
> I'd expect the following commands to have equivalent
> results:
>
>   grep myregex blah
>   grep myregex < blah
>   cat blah | grep myregex
>
> They are equivalent when the regular file blah uses
> Unix line ends, but they differ for a file blahdos which
> uses DOS line ends.  It appears to me as though grep
> is treating its input as binary when reading from a pipe,
> but correctly using "undossify_input()" in other cases.
>
> Here is an example.  I've created two files, blah (nl line-end)
> and blahdos (cr/nl line-end).
>
>    $ cat blah
>    foobarTest
>    $ od -Ax -a blah
>    000000   f   o   o   b   a   r   T   e   s   t  nl
>    00000b
>    $ od -Ax -a blahdos
>    000000   f   o   o   b   a   r   T   e   s   t  cr  nl
>    00000c
>
> These files should match the regex 'Test$' in all cases,
> but grep on blahdos fails for this case:
>
>    $ cat blahdos | grep 'Test$'
>    $
>
> And here's why (note the -v to invert the match so we have
> something to look at):
>
>    $ cat blahdos | grep -v 'Test$' | od -Ax -a
>    000000   f   o   o   b   a   r   T   e   s   t  cr  nl
>    00000c
>
> There's still a cr/nl on the output which wouldn't be there if
> grep had interpreted its input as having DOS line ends.  Here's
> what a successful grep of the UNIX line end file looks like:
>
>    $ cat blah | grep 'Test$' | od -Ax -a
>    000000   f   o   o   b   a   r   T   e   s   t  nl
>    00000b
>
> In fact, if I read the blahdos file in any other way except through
> a pipe, it successfully matches (note the stripped out cr on the output):
>
>    $ grep 'Test$' blahdos | od -Ax -a
>    000000   f   o   o   b   a   r   T   e   s   t  nl
>    00000b
>    $ grep 'Test$' < blahdos | od -Ax -a
>    000000   f   o   o   b   a   r   T   e   s   t  nl
>    00000b
>
> Just in case you might think that this has something to do with cat
> (I did), here's the output of cat for each file:
>
>    $ cat blah | od -Ax -a
>    000000   f   o   o   b   a   r   T   e   s   t  nl
>    00000b
>    $ cat blahdos | od -Ax -a
>    000000   f   o   o   b   a   r   T   e   s   t  cr  nl
>    00000c
>
> Using head instead of cat gives the same results as well, just to
> completely remove cat from the picture.
>
> I'm currently running these versions of tools on win2k:
>   cygwin     1.3.18-1
>   textutils  2.0.21 (cat, od, head)
>   grep       2.5
>   bash       2.05b.0(8)-release
>
> I also tried this out with cygwin 1.3.17-1 with identical results.
>
> If you need any further information, please cc me directly since I
> don't read the mailing lists very often.
>
> Stacey.

Stacey,

This is not a bug.  This is expected behavior.  For details, read
<http://cygwin.com/cygwin-ug-net/using-cygwinenv.html>.
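
If the goal is simply to make the piped case match, one workaround that does
not depend on any Cygwin setting is to strip the CR yourself before grep sees
the data. The following is only a minimal sketch: grep_nocr is a hypothetical
helper name, not part of grep or Cygwin, and it assumes that dropping every CR
in the stream (not just those at line ends) is acceptable for your data.

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not a Cygwin or grep feature):
# delete every CR byte from stdin, then pass the caller's arguments to grep.
# Note that tr removes ALL CRs, including any in the middle of a line.
grep_nocr() {
    tr -d '\r' | grep "$@"
}

# Recreate the test file from the report: a single line ending in CR/LF.
printf 'foobarTest\r\n' > blahdos

# With the CR gone, 'Test$' can match at the end of the line.
cat blahdos | grep_nocr 'Test$'
```

The output then ends in a bare nl, as it does in the non-piped cases from the
report.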
	Igor
-- 
				http://cs.nyu.edu/~pechtcha/
      |\      _,,,---,,_		pechtcha AT cs DOT nyu DOT edu
ZZZzz /,`.-'`'    -.  ;-;;,_		igor AT watson DOT ibm DOT com
     |,4-  ) )-,_. ,\ (  `'-'		Igor Pechtchanski
    '---''(_/--'  `-'\_) fL	a.k.a JaguaR-R-R-r-r-r-.-.-.  Meow!

Oh, boy, virtual memory! Now I'm gonna make myself a really *big* RAMdisk!
  -- /usr/games/fortune


