Mail Archives: djgpp-workers/1998/06/14/06:28:36

Date: Sun, 14 Jun 1998 13:26:41 +0300 (IDT)
From: Eli Zaretskii <eliz AT is DOT elta DOT co DOT il>
To: DJ Delorie <dj AT delorie DOT com>
cc: djgpp-workers AT delorie DOT com
Subject: Posix regexps
Message-ID: <Pine.SUN.3.91.980614131050.6294D-100000@is>
MIME-Version: 1.0

I'm participating in the pretest of GNU Sed 3.01 (which hopefully will 
include DJGPP support in the official distribution).  As part of the 
test, I've linked it with the Posix regexp functions in our libc.a 
instead of the GNU regexp library (this makes Sed 4 times faster :-).  But 
then it failed some of the test scripts from the test suite. 

It turns out that all of the failures use regular expressions that,
according to the docs of our regexp functions, are undefined by Posix.  
Here are some examples:
			*a
			^*
			(*)
			a**
			(a|)

These (and other) regexps are all fed to Sed with the -r switch, which 
switches the regexp syntax to the extended one, as opposed to the basic 
syntax used by default.  (When given the -r switch, Sed uses the 
REG_EXTENDED flag when calling the regexp functions.)

Our regexp functions return an error "bad regexp" for all the cases 
above (which I kinda understand), whereas GNU regexp has no problems with 
them.
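
A minimal sketch of how the difference could be reproduced outside of Sed, 
assuming only the standard <regex.h> interface (regcomp, regerror, regfree).  
The helper names try_compile and report_questionable_eres are made up for 
this illustration and are not part of libc or Sed.  The program makes no 
claim about which answer is "right"; it only prints what the regexp 
library it is linked against does with each pattern under REG_EXTENDED.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: compile `pattern` with `cflags` and return
   regcomp()'s result code.  On failure, the library's own error
   message is copied into `errbuf`; on success `errbuf` is empty. */
int try_compile(const char *pattern, int cflags, char *errbuf, size_t errlen)
{
    regex_t re;
    int rc;

    assert(pattern != NULL && errbuf != NULL && errlen > 0);
    errbuf[0] = '\0';

    rc = regcomp(&re, pattern, cflags);
    if (rc != 0)
        regerror(rc, &re, errbuf, errlen);  /* translate the error code */
    else
        regfree(&re);                       /* only free what compiled */
    return rc;
}

/* Hypothetical driver: feed the patterns quoted in this message to
   regcomp() as extended regexps and print the outcome of each one. */
void report_questionable_eres(void)
{
    static const char *const patterns[] = { "*a", "^*", "(*)", "a**", "(a|)" };
    char msg[128];
    size_t i;

    for (i = 0; i < sizeof patterns / sizeof patterns[0]; i++) {
        int rc = try_compile(patterns[i], REG_EXTENDED, msg, sizeof msg);
        printf("%-6s %s\n", patterns[i], rc == 0 ? "accepted" : msg);
    }
}
```

Calling report_questionable_eres() from main() once when linked against 
libc.a and once against the GNU regexp library should show the two 
behaviours side by side.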

Before I go out and yell at GNU people for testing non-standard features 
without notice, could somebody please look at the latest Posix standard 
regarding basic and extended regexps, and tell whether the above are 
indeed undefined? 

Another case which fails is when the regexp includes a backreference, 
like in "(....).*\1".  My references indicate that backreferences are not 
supported in extended regexps (only in basic ones), but I'd like a 
confirmation, please.
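
The backreference case can be probed the same way.  The sketch below 
again assumes only the standard <regex.h> calls; regex_matches is a 
hypothetical helper name invented for this example.  Passing 0 as the 
flags selects the basic syntax, where the group is written \(....\) and 
\1 refers back to it; passing REG_EXTENDED with "(....).*\1" exercises 
the case in question, and the result there depends on the library.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if `text` matches `pattern`,
   0 if it does not, and -1 if the pattern fails to compile. */
int regex_matches(const char *pattern, int cflags, const char *text)
{
    regex_t re;
    int rc;

    assert(pattern != NULL && text != NULL);

    if (regcomp(&re, pattern, cflags) != 0)
        return -1;                      /* pattern rejected by the library */

    rc = regexec(&re, text, 0, NULL, 0); /* no submatch info requested */
    regfree(&re);
    return rc == 0 ? 1 : 0;
}
```

For example, regex_matches("\\(....\\).*\\1", 0, "abcd--abcd") tests the 
basic-syntax form, while regex_matches("(....).*\\1", REG_EXTENDED, 
"abcd--abcd") tests the extended form; a return value of -1 from the 
latter would correspond to the failure described above.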

Thanks in advance for any help.
