Date: Tue, 24 Nov 2009 17:23:48 -0500
From: Christopher Faylor
To: cygwin AT cygwin DOT com
Subject: Re: Cygwin bash regexp matching doesn't treat "\b" properly
Message-ID: <20091124222348.GA8598@ednor.casa.cgf.cx>

On Tue, Nov 24, 2009 at 10:18:27PM +0000, Eric Blake wrote:
>aputerguy kosowsky.org> writes:
>
>> HOWEVER, this solution while sweet for cygwin-bash, has the CONVERSE
>> PROBLEM.
>> Apparently, the special strings [[:<:]] and [[:>:]] are not recognized under
>> Linux regex(7) - they give return code 2.
>
>And why is that surprising?  'man 7 regex' _did_ state:
>
>       There are two special cases of bracket expressions: the bracket
>       expressions `[[:<:]]' and `[[:>:]]' match the null string at the
>       beginning and end of a word respectively.  A word is defined as a
>       sequence of word characters which is neither preceded nor followed
>       by word characters.  A word character is an alnum character (as
>       defined by ctype(3)) or an underscore.  This is an extension,
>       compatible with but not specified by POSIX 1003.2, and should be
>       used with caution in software intended to be portable to other
>       systems.
>
>>
>> So, now I have the frustrating situation where \\b works in Linux but not in
>> Cygwin while [[:<:]] works in Cygwin but not in Linux.
>
>So, in true open source fashion, why not write a patch that teaches cygwin's
>regex(3) implementation that \b is a synonym to [[:<:][:>:]]?
>
>I, for one, would readily accept such a patch.  But it hasn't yet crept high
>enough on my personal itch list for me to spend the time writing it.
>
>Or, from a capitalistic viewpoint, is there anyone out there willing to pay for
>my time to write the patch on their behalf?  However, please be careful in how
>you respond to this offer (that is, this is one time where private email makes
>more sense to settle on a fair price, rather than advertising the entire
>transaction on the publicly archived cygwin list, if only so that what I
>consider a fair price does not set an unreasonable precedent for what someone
>else considers a fair price).

If anyone does this, they should remember that Cygwin's regex is based on
FreeBSD's, and making major changes to it is not something that we'd lightly
consider, since that causes potential merge conflicts later.

cgf
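
Until one of the two regex implementations learns the other's syntax, a script that has to run on both Cygwin and Linux can avoid \b and [[:<:]] / [[:>:]] altogether. The following is only a minimal sketch, not something proposed in this thread: it spells out the word boundary with plain POSIX ERE constructs, using the "alnum or underscore" definition of a word character from the regex(7) text quoted above. The function name has_word is hypothetical, and the sketch assumes the word itself contains no regex metacharacters.

```bash
#!/usr/bin/env bash

# has_word TEXT WORD  (hypothetical helper, for illustration only)
# Succeeds if WORD occurs in TEXT as a whole word.
# Assumption: WORD contains no regex metacharacters.
has_word() {
    local text=$1 word=$2
    # Left boundary: start of string or a non-word character.
    # Right boundary: end of string or a non-word character.
    # A word character is an alnum or an underscore.
    local re="(^|[^[:alnum:]_])${word}(\$|[^[:alnum:]_])"
    # The pattern is left unquoted so bash treats it as a regex.
    [[ $text =~ $re ]]
}
```

Usage would look like `has_word "foo bar baz" bar && echo match`. Unlike \b, this pattern consumes the neighbouring character, so BASH_REMATCH[0] may include a leading or trailing non-word character; that only matters if the matched text is used afterwards.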