Mail Archives: cygwin/2009/11/24/17:24:07

X-Recipient: archive-cygwin AT delorie DOT com
X-Spam-Check-By: sourceware.org
Date: Tue, 24 Nov 2009 17:23:48 -0500
From: Christopher Faylor <cgf-use-the-mailinglist-please AT cygwin DOT com>
To: cygwin AT cygwin DOT com
Subject: Re: Cygwin bash regexp matching doesn't treat "\b" properly
Message-ID: <20091124222348.GA8598@ednor.casa.cgf.cx>
Reply-To: cygwin AT cygwin DOT com
Mail-Followup-To: cygwin AT cygwin DOT com
References: <26500158 DOT post AT talk DOT nabble DOT com> <26500814 DOT post AT talk DOT nabble DOT com> <4B0C4C2A DOT 3080502 AT gmail DOT com> <26503748 DOT post AT talk DOT nabble DOT com> <loom DOT 20091124T231321-600 AT post DOT gmane DOT org>
MIME-Version: 1.0
In-Reply-To: <loom.20091124T231321-600@post.gmane.org>
User-Agent: Mutt/1.5.20 (2009-06-14)
Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm
List-Id: <cygwin.cygwin.com>
List-Unsubscribe: <mailto:cygwin-unsubscribe-archive-cygwin=delorie DOT com AT cygwin DOT com>
List-Subscribe: <mailto:cygwin-subscribe AT cygwin DOT com>
List-Archive: <http://sourceware.org/ml/cygwin/>
List-Post: <mailto:cygwin AT cygwin DOT com>
List-Help: <mailto:cygwin-help AT cygwin DOT com>, <http://sourceware.org/ml/#faqs>
Sender: cygwin-owner AT cygwin DOT com
Delivered-To: mailing list cygwin AT cygwin DOT com

On Tue, Nov 24, 2009 at 10:18:27PM +0000, Eric Blake wrote:
>aputerguy <nabble <at> kosowsky.org> writes:
>
>> HOWEVER, this solution, while sweet for cygwin-bash, has the CONVERSE
>> PROBLEM.
>> Apparently, the special strings [[:<:]] and [[:>:]] are not recognized under
>> Linux regex(7) - they give return code 2.
>
>And why is that surprising?  'man 7 regex' _did_ state:
>
> There are two special cases of bracket expressions: the bracket
>       expressions `[[:<:]]' and `[[:>:]]' match the null string at the
>       beginning and end of a word respectively.  A word is defined as a
>       sequence of word characters which is neither preceded nor followed by
>       word characters.  A word character is an alnum character (as defined
>       by ctype(3)) or an underscore.  This is an extension, compatible with
>       but not specified by POSIX 1003.2, and should be used with caution in
>       software intended to be portable to other systems.
>
>> 
>> So, now I have the frustrating situation where \\b works in Linux but not in
>> Cygwin while [[:<:]] works in Cygwin but not in Linux.
>
>So, in true open source fashion, why not write a patch that teaches cygwin's 
>regex(3) implementation that \b is a synonym to [[:<:][:>:]]?
>
>I, for one, would readily accept such a patch.  But it hasn't yet crept high 
>enough on my personal itch list for me to spend the time writing it.
>
>Or, from a capitalistic viewpoint, is there anyone out there willing to pay for 
>my time to write the patch on their behalf?  However, please be careful in how 
>you respond to this offer (that is, this is one time where private email makes 
>more sense to settle on a fair price, rather than advertising the entire 
>transaction on the publicly archived cygwin list, if only so that what I 
>consider a fair price does not set an unreasonable precedent for what someone 
>else considers a fair price).

If anyone does this, they should remember that Cygwin's regex is based on
FreeBSD's, and making major changes to it is not something that we'd
lightly consider, since that causes potential merge conflicts later.
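
Until such a patch exists, a script that has to run under both Cygwin and
Linux can avoid both extensions by spelling the word boundary out in plain
POSIX ERE.  What follows is only a minimal sketch, not something taken from
this thread: the function name has_word is hypothetical, it needs bash for
[[ =~ ]], and it assumes the word being searched for contains no regex
metacharacters.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if $2 occurs as a whole word in $1.
# Word characters are alnum plus underscore, as in the regex(7) text quoted above.
has_word() {
    local text=$1 word=$2
    # Spell the boundaries out using POSIX ERE only: start of string or a
    # non-word character before the word, end of string or a non-word
    # character after it.  Assumes $word has no regex metacharacters.
    local re="(^|[^[:alnum:]_])${word}(\$|[^[:alnum:]_])"
    # $re is left unquoted so bash treats it as a regex, not a literal string.
    [[ $text =~ $re ]]
}
```

Called as: has_word "foo bar baz" bar.  Unlike \b or [[:<:]], this pattern
consumes the neighbouring character, so BASH_REMATCH[0] may include one extra
character on either side of the word.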

cgf

--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
