Date: Tue, 14 Mar 2006 11:35:17 +0100
From: Corinna Vinschen <corinna-cygwin@cygwin.com>
To: cygwin@cygwin.com
Subject: Re: Bug in POSIX.2 regex word boundary matching
Message-ID: <20060314103517.GD5887@calimero.vinschen.de>
Reply-To: cygwin@cygwin.com
Mail-Followup-To: cygwin@cygwin.com
References: <1142317647.4416624fd617a@imp1-g19.free.fr> <1142318263.441664b7bea01@imp1-g19.free.fr>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1142318263.441664b7bea01@imp1-g19.free.fr>
User-Agent: Mutt/1.4.2i

On Mar 14 19:37, dominique.pelle@free.fr wrote:
> Bleah. #include statements were missing in my
> previously posted sample test case.  Here
> is the test case again with #include statements
> this time:
> 
> $ cat regex-bug.c
> 
> #include <stdio.h>
> #include <regex.h>
> #include <stdlib.h>
> 
> int main()
> {
>   regex_t    r;
>   regmatch_t pmatch[2];
> 
>   if (regcomp(&r, "\\bfoobar\\b", REG_EXTENDED) != 0) {
>     fprintf(stderr, "regcomp failed\n");
>     exit(-1);
>   }
> 
>   /* I'd expect above regex to match following string */
>   if (regexec(&r, "test foobar test", 2, pmatch, 0) == 0) {
>     fprintf(stderr, "OK (match)\n");  /* expected behavior */
>   } else {
>     fprintf(stderr, "FAIL (mismatch)\n"); /* unexpected!? */
>   }
>   return 0;
> }
> 
> $ gcc regex-bug.c
> $ ./a.out
> 
> Outcome on Cygwin ................ FAIL (mismatch)
> 
> Outcome on Linux (Ubuntu-5.10) ... OK (match)

Linux uses glibc's GNU regex library, which supports extensions familiar
from Perl, such as \b and \w.  Cygwin's regex is Henry Spencer's
implementation, which does not recognize these extensions.  Note that the
POSIX standard for regular expressions does not define them either; see
http://www.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap09.html
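
If the test case has to behave the same on both systems, the word
boundary can be approximated with POSIX constructs only.  The following
is a minimal sketch, not a drop-in replacement for \b: the helper name
contains_foobar_word and the macro FOOBAR_WORD_RE are made up for
illustration, and a "word character" is assumed to be an alphanumeric
character or an underscore.

```c
#include <sys/types.h>
#include <stddef.h>
#include <regex.h>

/* Approximation of "\\bfoobar\\b" using only POSIX ERE syntax:
   "foobar" must be preceded by the start of the string or a non-word
   character, and followed by a non-word character or the end of the
   string.  (Assumption: word characters are [[:alnum:]] and '_'.) */
#define FOOBAR_WORD_RE "(^|[^[:alnum:]_])foobar([^[:alnum:]_]|$)"

/* Hypothetical helper: returns 1 if text contains "foobar" as a whole
   word, 0 if it does not, and -1 if the pattern fails to compile. */
int contains_foobar_word(const char *text)
{
  regex_t r;
  int     rc;

  /* REG_NOSUB: only a yes/no answer is needed, no match offsets. */
  if (regcomp(&r, FOOBAR_WORD_RE, REG_EXTENDED | REG_NOSUB) != 0)
    return -1;

  rc = regexec(&r, text, 0, NULL, 0);
  regfree(&r);

  return rc == 0 ? 1 : 0;
}
```

Unlike \b, the bracket expressions consume the neighbouring character,
so if match offsets are needed, the overall match may include one
delimiter on either side of "foobar".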


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat
