Mail Archives: cygwin/2009/03/04/08:54:57

Subject: RE: grep -P regexp problem
Date: Wed, 4 Mar 2009 13:54:29 -0000
Message-ID: <5E25AF06EFB9EA4A87C19BC98F5C875302B428C2@core-email.int.ascribe.com>
In-Reply-To: <BAY127-DS374577AD66B9A49AF63E9A6A60@phx.gbl>
References: <BAY127-DS374577AD66B9A49AF63E9A6A60 AT phx DOT gbl>
From: "Phil Betts" <Phil DOT Betts AT ascribe DOT com>
To: <cygwin AT cygwin DOT com>
Reply-To: <cygwin AT cygwin DOT com>

Andriy Sen wrote:
> Below is an example of the problem.
>
> G:\>cat test.s
> a
> 1
>
> G:\>cat test.s | grep -P "[^0]1"
> a
> 1

This is not cygwin-specific, so it is really OT for this list. That
being said...

grep -P treats the whole input as a single string, and outputs the
line (or lines) containing the match for the pattern.  [^0] matches
ANYTHING except 0, including linefeeds.

In your case, the [^0] is matching the linefeed preceding the 1.  That
linefeed is considered part of the line "a\n", so that line is
included in the output.  In other words, although it looks like there
are two matches output, in fact there is only one match, and that is
"a\n1\n"
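
The behaviour described above can be imitated outside grep.  The
following is only a rough sketch in Python, not grep's actual
implementation: it assumes the whole input is matched as one string and
that every line overlapped by a match is printed.  The function name
lines_touched_by_matches is made up for this illustration.

```python
import re

def lines_touched_by_matches(text, pattern):
    """Return every line overlapped by a match, treating text as one subject string."""
    # Record the (start, end) offsets of each line; the trailing "\n"
    # is counted as part of the line it terminates.
    spans = []
    pos = 0
    for line in text.splitlines(keepends=True):
        spans.append((pos, pos + len(line)))
        pos += len(line)

    # Collect the index of every line that any match overlaps.
    hit = set()
    for m in re.finditer(pattern, text):
        for i, (start, end) in enumerate(spans):
            if m.start() < end and start < m.end():
                hit.add(i)

    lines = text.splitlines()
    return [lines[i] for i in sorted(hit)]

print(lines_touched_by_matches("a\n1\n", r"[^0]1"))    # ['a', '1']
print(lines_touched_by_matches("a\n1\n", r"[^0\n]1"))  # []
```

The single match "\n1" straddles both lines, which is why both are
reported under this model.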

Assuming you wish to match single lines containing a character other
than 0 followed by a 1, you probably want the pattern to be '[^0\n]1'

It's probably a bit clearer if the test file is a bit bigger:

$ echo -e 'a\n1\n2\n3\n4\n1\n2\n21\n' > test.txt
$ grep -P '[^0]1' test.txt
a
1
4
1
21

This output contains 3 matches: "a\n1\n", "4\n1\n" and "21\n", whereas:

$ grep -P '[^0\n]1' test.txt
21

only matches single lines with a 1 that follows anything but 0.
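
For contrast, here is a minimal Python sketch of what happens if each
line is tested on its own, with its linefeed stripped.  This per-line
model is an assumption made purely for comparison (grep_per_line is a
hypothetical helper, not anything grep provides); under it, [^0] never
sees a linefeed, so both patterns select the same line.

```python
import re

def grep_per_line(text, pattern):
    """Return the lines that match when each line (without its newline) is a separate subject."""
    return [line for line in text.splitlines() if re.search(pattern, line)]

# Same content as test.txt in the example above.
sample = "a\n1\n2\n3\n4\n1\n2\n21\n\n"

print(grep_per_line(sample, r"[^0]1"))    # ['21']
print(grep_per_line(sample, r"[^0\n]1"))  # ['21']
```

The extra lines in the first grep run come only from a match being
allowed to span a line boundary.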


Phil