To: cygwin AT cygwin DOT com
From: Sam Steingold
Subject: Re: cygwin/regex is non-POSIX
Date: Mon, 19 Jan 2004 13:54:06 -0500
Organization: disorganization
Reply-To: sds AT gnu DOT org

> * Yitzchak Scott-Thoennes [2004-01-17 21:04:50 -0800]:
>
> On Thu, Jan 15, 2004 at 03:14:57PM -0500, Sam Steingold wrote:
>> the cygwin regex is not POSIX.
>> backrefs are not available by default (apparently you need REG_BACKR for
>> that), "(a|)*" cannot be compiled because of "empty (sub)expression",
>> &c &c.
>
> SUSv3 says:
>   A vertical-line appearing first or last in an ERE, or immediately
>   following a vertical-line or a left-parenthesis, or immediately
>   preceding a right-parenthesis, produces undefined results.

Thanks, you are right, complaint is withdrawn.

> Also, it says backrefs are part of basic regular expressions but not
> extended ones.  From your mention of | I assume you are using
> REG_EXTENDED.  If REG_EXTENDED|REG_BACKR allows backrefs, it doesn't
> appear to be documented.

I am not sure what you mean here.
I would like to interpret your words as follows, so that I can agree
with you:

The standard does not mention REG_BACKR, so its mere presence can
probably be construed as a violation of the standard (unless it is
enabled whenever REG_EXTENDED is).
REG_BACKR is also not mentioned in "man regex", so it is not documented.
Right?

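To make the BRE/ERE distinction concrete, here is a minimal sketch in C that uses only the standard regcomp(), regexec() and regfree() calls. The helper name regex_matches is hypothetical and not part of any library. With cflags set to 0 the pattern is compiled as a basic regular expression, which is where SUSv3 defines back references, so "\(x\)\1" should match "xx" on a conforming implementation.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper (not a library function).
   Returns 1 if `text` matches `pattern` compiled with `cflags`,
   0 if it does not match, and -1 if regcomp() rejects the pattern. */
int regex_matches(const char *pattern, int cflags, const char *text)
{
    regex_t re;
    int rc;

    assert(pattern != NULL && text != NULL);

    if (regcomp(&re, pattern, cflags) != 0)
        return -1;                      /* pattern did not compile */

    /* No submatch offsets are requested: nmatch = 0, pmatch = NULL. */
    rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);

    return rc == 0 ? 1 : 0;
}
```

For example, regex_matches("^\\(x\\)\\1$", 0, "xx") exercises the BRE case. Passing "^(x)+\\1$" with REG_EXTENDED exercises the extended-syntax case discussed in this thread; since the standard does not define back references in EREs, that result depends on the implementation.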
Now, whether I add REG_BACKR to cflags (together with REG_EXTENDED) or
to eflags, I do not get back references: "^(x)+\\1$" does not match "xx"
(the match should be "xx" for the whole expression and "x" for the group).

Finally, a common extension appears to be the use of "?" after a
repetition specification to mean non-greedy matching, e.g. "a+?" will
match only the first "a" in "aaaa".

-- 
Sam Steingold (http://www.podval.org/~sds) running w2k
When C++ is your hammer, everything looks like a thumb.
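
For anyone who wants to reproduce the back-reference test, the following is a minimal sketch of a checker that reports the whole match and the first group. The name report_match is hypothetical. It relies only on the standard regcomp(), regexec(), regerror() and regfree() calls. Whether "^(x)+\\1$" compiles and matches under REG_EXTENDED is implementation-dependent, so the sketch reports what happened instead of assuming a result.

```c
#include <assert.h>
#include <regex.h>
#include <stdio.h>

/* Hypothetical helper (not a library function).
   Compiles `pattern` with `cflags`, runs it on `text`, stores the whole
   match in pmatch[0] and the first group in pmatch[1], and prints both.
   Returns 1 on match, 0 on no match, -1 if regcomp() rejects the pattern. */
int report_match(const char *pattern, int cflags, const char *text,
                 regmatch_t pmatch[2])
{
    regex_t re;
    int rc;

    assert(pattern != NULL && text != NULL && pmatch != NULL);

    rc = regcomp(&re, pattern, cflags);
    if (rc != 0) {
        char msg[128];
        regerror(rc, &re, msg, sizeof msg);
        fprintf(stderr, "regcomp(\"%s\"): %s\n", pattern, msg);
        return -1;
    }

    rc = regexec(&re, text, 2, pmatch, 0);
    regfree(&re);
    if (rc != 0) {
        printf("\"%s\" does not match \"%s\"\n", pattern, text);
        return 0;
    }

    printf("whole: \"%.*s\"\n",
           (int)(pmatch[0].rm_eo - pmatch[0].rm_so), text + pmatch[0].rm_so);
    if (pmatch[1].rm_so != -1)          /* -1 means the group did not participate */
        printf("group: \"%.*s\"\n",
               (int)(pmatch[1].rm_eo - pmatch[1].rm_so), text + pmatch[1].rm_so);
    return 1;
}
```

Calling report_match("^(x)+\\1$", REG_EXTENDED, "xx", m) with a two-element regmatch_t array m runs the case described above. Adding any non-standard flag such as REG_BACKR to cflags only compiles where the local regex.h defines it.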