Mail Archives: cygwin/2005/01/30/22:27:32

Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm
List-Subscribe: <mailto:cygwin-subscribe AT cygwin DOT com>
List-Archive: <http://sourceware.org/ml/cygwin/>
List-Post: <mailto:cygwin AT cygwin DOT com>
List-Help: <mailto:cygwin-help AT cygwin DOT com>, <http://sourceware.org/ml/#faqs>
Sender: cygwin-owner AT cygwin DOT com
Mail-Followup-To: cygwin AT cygwin DOT com
Delivered-To: mailing list cygwin AT cygwin DOT com
Date: Sun, 30 Jan 2005 22:26:55 -0500 (EST)
From: Igor Pechtchanski <pechtcha AT cs DOT nyu DOT edu>
Reply-To: cygwin AT cygwin DOT com
To: Luke Kendall <luke DOT kendall AT cisra DOT canon DOT com DOT au>
cc: cygwin AT cygwin DOT com
Subject: Re: Updated: sed-4.1.3-1
In-Reply-To: <20050130223445.729458570A@pessard.research.canon.com.au>
Message-ID: <Pine.GSO.4.61.0501302212410.27388@slinky.cs.nyu.edu>
References: <20050130223445 DOT 729458570A AT pessard DOT research DOT canon DOT com DOT au>
MIME-Version: 1.0

On Mon, 31 Jan 2005, Luke Kendall wrote:

> On 29 Jan, Corinna Vinschen wrote:
> >  * regex addresses do not use leftmost-longest matching.  In other words,
> >    /.\+/ only looks for a single character, and does not try to find as
> >    many of them as possible like it used to do.
>
> Interesting: does that mean every existing script that relied on the old
> behaviour must change?  I'm glad I stuck with the old "/..*/" notation
> when I wanted one or more repetitions!

I believe you are confused here.  Yes, every script that *relied* on the
old behavior will have to change, but the number of those is vastly
smaller than you seem to think.  Very few scripts actually rely on this;
the only ones that will behave differently are scripts like

	sed -e '/^\(.\+\)/s//---\1---/'

where the regex address pattern is saved and used in the subsequent
replacement (and is not anchored on the right side).  The above script
will turn "abcde" into "---a---bcde" with the new behavior, and
"---abcde---" with the old one.  Note that the pattern has to be
unanchored on the right for the behavior to change; the behavior of

	sed -e '/^\(.\+\)$/s//---\1---/'

should stay the same.  BTW, the latter script *is* the way to fix the
former (they were equivalent under the old behavior).
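
For anyone who wants to check the fixed form locally, here is a minimal
sketch.  It assumes GNU sed (\+ is a GNU extension) and a POSIX shell; the
variable names are only for illustration.  It runs the right-anchored
script on "abcde" and, for comparison, the same substitution with the
pattern written out in the s command instead of reused via the empty
regex.  Both should produce the same line on any sed version, since the
anchored pattern can only match the whole line.

```shell
# Assumes GNU sed: \+ is a GNU extension to basic regular expressions.
line='abcde'

# Right-anchored address; the empty regex in s// reuses the address pattern.
via_address=$(printf '%s\n' "$line" | sed -e '/^\(.\+\)$/s//---\1---/')

# Same substitution with the pattern spelled out in the s command itself.
direct=$(printf '%s\n' "$line" | sed -e 's/^\(.\+\)$/---\1---/')

# Print both results so they can be compared by eye.
printf '%s\n%s\n' "$via_address" "$direct"
```

Spelling the pattern out in the s command is another way to avoid
depending on how the address regex is matched at all.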

> So \+ now works the opposite of * (\+ = shortest, * = longest)?  And .\+
> is now a synonym for a single "."?  So, why would you use .\+?

No, .\+ still means "one or more".  It's just when you say

	sed -e '/^abc.\+/d'

to delete all lines that start with "abc" followed by at least one more
character, sed will no longer have to go through the whole line to
determine that the address matches (as it used to).  Note that the above
was a pretty silly way of writing this anyway, as '/^abc./d' would have
sufficed.
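
A quick sketch to illustrate that the two addresses select the same
lines.  Again this assumes GNU sed and a POSIX shell, and the sample
input and variable names are made up for the example.  A line that is
exactly "abc" is kept by both scripts, because each pattern needs at
least one character after "abc".

```shell
# Assumes GNU sed: \+ is a GNU extension to basic regular expressions.
# Hypothetical sample input: four short lines.
input=$(printf 'abc\nabcd\nabcdef\nxabc\n')

# Address written with .\+ (one or more characters after "abc").
long_form=$(printf '%s\n' "$input" | sed -e '/^abc.\+/d')

# Address written with a single . (one character after "abc" is enough).
short_form=$(printf '%s\n' "$input" | sed -e '/^abc./d')

# Show what each script leaves behind.
printf 'with .\\+:\n%s\nwith .:\n%s\n' "$long_form" "$short_form"
```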

> Ah, I see, it's a way of matching zero or one occurrences.  I would have
> thought a new symbol would have made more sense for the new semantics,
> so as to preserve backward compatibility.
>
> Probably I've misunderstood.

I believe so.  Unless I, too, am totally confused.
	Igor
-- 
				http://cs.nyu.edu/~pechtcha/
      |\      _,,,---,,_		pechtcha AT cs DOT nyu DOT edu
ZZZzz /,`.-'`'    -.  ;-;;,_		igor AT watson DOT ibm DOT com
     |,4-  ) )-,_. ,\ (  `'-'		Igor Pechtchanski, Ph.D.
    '---''(_/--'  `-'\_) fL	a.k.a JaguaR-R-R-r-r-r-.-.-.  Meow!

"The Sun will pass between the Earth and the Moon tonight for a total
Lunar eclipse..." -- WCBS Radio Newsbrief, Oct 27 2004, 12:01 pm EDT

