> From: "Zack Weinberg" <zackw AT stanford DOT edu>
> Date: Fri, 8 Jun 2001 09:59:32 -0700
>
> On Fri, Jun 08, 2001 at 10:06:51AM +0300, Eli Zaretskii wrote:
> >
> > One notorious problem with GNU regex is that it is quite slow for many
> > simple jobs, such as matching a simple regular expression with no
> > backtracking.  It seems that the main reason for this slowness is the
> > fact that GNU regex supports null characters in strings.  For
> > example, Sed 3.02 compiled with GNU regex is about 2-4 times slower
> > on simple jobs than the same Sed compiled with Spencer's regex
> > library.
>
> I think the null characters are a red herring.

It's possible; I never had time to look into it far enough to be sure.
All I know is that the slow-down happened between two specific versions
of GNU regex, and the support for null characters was introduced between
those two versions.

> The regex.c that came with GDB 4.18, which I think is the one that got
> spread around widely, had a bug in its implementation of the POSIX
> regcomp/regexec interface, which caused a major performance hit.  That
> bug has been fixed in GNU libc for a long time.  When I replaced
> fixincludes' copy of regex.c with a more recent version from glibc,
> fixincludes was sped up by a factor of nine.  That same bug affects
> Sed 3.02 - replace the regex.c it ships with with the one from glibc
> 2.2.x and I bet you'll see better performance.
>
> There's some discussion in these messages:
>
> http://gcc.gnu.org/ml/gcc-patches/2000-01/msg00764.html
> http://gcc.gnu.org/ml/gcc-patches/2000-01/msg00765.html

Thanks for the pointers.
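
For anyone who wants to check the regcomp/regexec claim against their own
copy of regex.c, here is a minimal sketch of the kind of "simple job"
being discussed: compile a pattern once through the POSIX interface, then
run it over many lines.  The function name count_matches and its
arguments are my own invention for illustration; they come from neither
Sed nor fixincludes.  Calling it in a loop under a timer, linked first
against one regex.c and then against the other, should give a rough
comparison.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: count how many of the n strings in lines[]
   match the POSIX extended regular expression `pattern`.
   Returns -1 if the pattern fails to compile. */
int count_matches(const char *pattern, const char *const *lines, size_t n)
{
    regex_t re;
    int count = 0;
    size_t i;

    /* REG_NOSUB: only a yes/no answer is needed, not submatch offsets. */
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;

    /* The pattern is compiled once; only regexec runs per line. */
    for (i = 0; i < n; i++)
        if (regexec(&re, lines[i], 0, NULL, 0) == 0)
            count++;

    regfree(&re);
    return count;
}
```

Note that the POSIX interface takes NUL-terminated strings, so this
sketch says nothing about the null-character question either way.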