Mail Archives: djgpp/2014/12/11/15:39:43
This is a port of GNU grep 2.21 to MSDOS/DJGPP.
This is GNU grep, the "fastest grep in the west".
GNU grep is based on a fast lazy-state deterministic matcher (about
twice as fast as stock Unix egrep) hybridized with a Boyer-Moore-Gosper
search for a fixed string that eliminates impossible text from being
considered by the full regexp matcher without necessarily having to
look at every character. The result is typically many times faster
than Unix grep or egrep. (Regular expressions containing backreferencing
will run more slowly, however.)
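The idea of skipping text without looking at every character can be
illustrated with a small sketch. This is a simplified Boyer-Moore-Horspool
search written for illustration only; it is not the matcher used by grep, and
the function name horspool_find is an assumption made for this example.

```python
def horspool_find(text, needle):
    """Return (index of first match or -1, number of text bytes inspected).

    Simplified Boyer-Moore-Horspool search over bytes objects. Illustrative
    only; this is not grep's actual implementation.
    """
    m, n = len(needle), len(text)
    if m == 0:
        return 0, 0
    # Shift table: how far the window may jump, keyed by the text byte
    # currently aligned with the last position of the needle.
    shift = {b: m - 1 - i for i, b in enumerate(needle[:-1])}
    inspected = 0
    pos = 0
    while pos + m <= n:
        # Compare the window against the needle from right to left.
        j = m - 1
        while j >= 0:
            inspected += 1
            if text[pos + j] != needle[j]:
                break
            j -= 1
        if j < 0:
            return pos, inspected
        # Bytes that do not occur in the needle allow a jump of m positions.
        pos += shift.get(text[pos + m - 1], m)
    return -1, inspected
```

For a long text that rarely contains the bytes of the search string, the
number of inspected bytes stays well below the length of the text, which is
the effect the paragraph above describes.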
DJGPP specific changes.
=======================
DJGPP specific changes are those required to implement colorization support
for this port. If grep is called with the command line option --color and
the output is directed to the screen then the default colors will be used
to mark matches, file names and line numbers. If the output does not go to
the screen then colorization is automatically suppressed. Read the docs to
learn how to control the color using the environment variable GREP_COLORS.
As usual all changes are documented in the diffs file that is stored in the
/djgpp directory.
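The rule described above (colorize only if --color was given and the output
goes to the screen) can be sketched as follows. This is a minimal
illustration, assuming a terminal check via isatty(); the actual screen
detection used by the DJGPP port is documented in the diffs file and may
differ. The function name should_colorize is hypothetical.

```python
import sys


def should_colorize(color_requested, stream=None):
    """Decide whether matches should be colorized.

    color_requested: True if the user passed --color (assumed parsed elsewhere).
    stream: output stream to inspect; defaults to sys.stdout.
    """
    if stream is None:
        stream = sys.stdout
    # Colors are only used when requested AND the output is a terminal;
    # redirected or piped output stays plain.
    return bool(color_requested and stream.isatty())
```

With this rule, output redirected to a file or a pipe stays free of escape
sequences even when --color is given.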
Starting with grep-2.20, multi-byte character support is required to build
the binaries. It was necessary to provide DJGPP specific implementations
of the iswctype and wctype functions to be able to compile the grep sources
at all. Please note that although multi-byte character support is assumed
to be available to compile grep, neither the DJGPP port nor DJGPP itself
supports multi-byte character encodings like UTF-8 or similar.
All this means that if future sources require even more multi-byte support,
this will be the last grep version ported to DOS using DJGPP, unless
someone volunteers to implement full-featured UTF-8 support for DJGPP.
Please note that to run the test suite produced with autoconf 2.64 and later
you must install mktmp17br2 or later. Because the test suite also tries to
test multibyte patterns, but multibyte strings are not fully supported by
DJGPP, some tests will be skipped or fail. The yesno test is known to fail,
so please do not report it. Before starting the test suite, please make sure
to unset the GREP_OPTIONS in djgpp.env or the test suite may not work as
expected.
The grep program uses a new method called fts to traverse a file tree. This
code is very POSIX-centric. In particular, it uses gnulib functions like openat,
openat-proc and fdopendir that try to access directories using file descriptors
obtained with open(). This is supported by djdev204 but not by djdev203. This
means that if grep is compiled using djdev203, the program will no longer be
able to recurse into directories, nor will it be able to follow symlinks.
This is because djdev203 only produces symlinks for programs. In both cases
grep will always terminate with ENOSYS. grep211 was probably the last DJGPP
port of grep compiled with djdev203.
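To illustrate what "accessing directories using file descriptors" means, here
is a small sketch of a tree walk in which every directory is opened relative
to the descriptor of its parent, in the style of openat and fdopendir. It is
not the gnulib fts code, it assumes a POSIX system where os.open() accepts
dir_fd and os.listdir() accepts a descriptor, and the function name list_tree
is hypothetical.

```python
import os
import stat


def list_tree(path, dir_fd=None):
    """Return the relative paths of all non-directory entries below path.

    Each directory is opened relative to its parent's descriptor (like
    openat) and listed through that descriptor (like fdopendir).
    """
    fd = os.open(path, os.O_RDONLY | os.O_DIRECTORY, dir_fd=dir_fd)
    try:
        found = []
        for name in sorted(os.listdir(fd)):
            # Do not follow symlinks while classifying the entry.
            st = os.stat(name, dir_fd=fd, follow_symlinks=False)
            if stat.S_ISDIR(st.st_mode):
                for sub in list_tree(name, dir_fd=fd):
                    found.append(os.path.join(name, sub))
            else:
                found.append(name)
        return found
    finally:
        os.close(fd)
```

A C library that cannot open a directory with open() has no descriptor to
hand to the next level, which is why this style of traversal fails at the
first step on such a system.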
Please also note that there is a function name clash between gnulib's and
grep's gettext wrapper and djgpp's old BORLAND compatibility gettext
function declared in conio.h. This issue has been solved in djdev204.
To solve the problem for djdev203, I provide the patch /djgpp/conio.patch
that changes djdev203's conio.h to match djdev204's conio.h. After
this change the name clash is resolved in the same way as it has been
resolved for djdev204. The patch only concerns this name clash.
Please note that starting with grep-2.21 the GREP_OPTIONS environment variable
is no longer supported. However, this environment variable is set in djgpp.env,
and thus every time grep is started it issues a warning message either to
STDOUT or to STDERR. This may break scripts that analyse the grep output.
If you use grep 2.21 you should remove the GREP_OPTIONS entry from your
djgpp.env. This environment variable has become obsolete, and the functionality
it provided must be reproduced by a script that calls grep with
the desired additional options.
If you compile the sources and you want to run the test suite you __MUST__
remove the variable or the warning generated by grep will make a lot of
checks fail.
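A script that replaces GREP_OPTIONS only has to put the desired default
options in front of the user's arguments. The following is a minimal sketch
of such a wrapper. The option list, the function names and the assumption
that the program is reachable as "grep" are illustrative and not part of
this port.

```python
import subprocess

# Options that used to live in GREP_OPTIONS (example value only).
DEFAULT_OPTIONS = ["--color=auto"]


def build_command(args, defaults=DEFAULT_OPTIONS, program="grep"):
    """Return the argument vector: program, default options, user arguments."""
    return [program, *defaults, *args]


def run_grep(args):
    """Run grep with the default options prepended; return its exit status."""
    return subprocess.call(build_command(args))
```

Calling run_grep(["-n", "pattern", "file.txt"]) then behaves like running
grep --color=auto -n pattern file.txt, without any warning about the
environment variable.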
This port has been configured with perl-regexp (pcre) support enabled.
This means that you will have to install the pcre library, available as:
ftp://ftp.delorie.com/pub/djgpp/current/v2tk/pcre836b.zip
or
ftp://ftp.delorie.com/pub/djgpp/beta/v2tk/pcre836b.zip
if you decide to compile the preconfigured sources. If you prefer to
disable the pcre support you will have to reconfigure and recompile the
sources, this time passing the "no-pcre" command line option to
config.bat. The grep221b.zip file contains only the one build with
perl-regexp support enabled.
This port has no wide character/multi byte support at all.
To build this port and run the test suite you will need LFN support.
The source package is configured to be built in the "_build.204" directory.
The port has been configured and compiled with NLS support enabled using the
latest ports of libiconv, libunistring and gettext.
ftp://ftp.delorie.com/pub/djgpp/beta/v2gnu/licv114br2.zip
ftp://ftp.delorie.com/pub/djgpp/beta/v2gnu/gtxt192b.zip
This port provides NLS support for the version compiled with DJGPP 2.04.
It has been configured with NLS support enabled. If you prefer no NLS,
then reconfigure the sources passing the no-nls flag to the config.bat file.
The port has been configured and compiled on WinXP SP3. There is no guarantee
that this will be possible with any other DOS-like OS. Due to the massive use
of long file names it will not be possible to configure and compile without
LFN support.
The DJGPP 2.04 version of the port has been compiled using gcc490 and
bnu224br2. But instead of using the libc.a provided by djdev204, a libc
version compiled from the repository code has been used. The repository
code has been patched with the memory patch as provided by Andris Pavenis
in:
http://ap1.pp.fi/djgpp/djdev/djgpp/20140421/use_nmalloc.diff
The goal is to test how well the new memory system and the current libc
code work. The repository code can be downloaded from Martin Stromberg's
site as:
http://www.ludd.luth.se/~ams/djgpp/cvs/djgpp.cvs.tar.gz
All the changes done to the original distribution are documented in the
diffs file and located together with all the files needed to configure
the package (config.bat, config.sed, config.site, etc.) in the /djgpp
directory.
For further information about GNU grep please read the info docs and NEWS file.
This is a verbatim extract of the NEWS file (multi-byte specific features are
not supported by the DJGPP port):
-------------------------------------------------------------------------------
* Noteworthy changes in release 2.21 (2014-11-23) [stable]
** Improvements
Performance has been greatly improved for searching files containing
holes, on platforms where lseek's SEEK_DATA flag works efficiently.
Performance has improved for rejecting data that cannot match even
the first part of a nontrivial pattern.
Performance has improved for very long strings in patterns.
If a file contains data improperly encoded for the current locale,
and this is discovered before any of the file's contents are output,
grep now treats the file as binary.
grep -P no longer reports an error and exits when given invalid UTF-8 data.
Instead, it considers the data to be non-matching.
** Bug fixes
grep no longer mishandles patterns that contain \w or \W in multibyte
locales.
grep would fail to count newlines internally when operating in non-UTF8
multibyte locales, leading it to print potentially many lines that did
not match. E.g., the command, "seq 10 | env LC_ALL=zh_CN src/grep -n .."
would print this:
1:1
2
3
4
5
6
7
8
9
10
implying that the match, "10" was on line 1.
[bug introduced in grep-2.19]
grep -F -x -o no longer prints an extra newline for each match.
[bug introduced in grep-2.19]
grep in a non-UTF8 multibyte locale could mistakenly match in the middle
of a multibyte character when using a '^'-anchored alternate in a pattern,
leading it to print non-matching lines. [bug present since "the beginning"]
grep -F Y no longer fails to match in non-UTF8 multibyte locales like
Shift-JIS, when the input contains a 2-byte character, XY, followed by
the single-byte search pattern, Y. grep would find the first, middle-
of-multibyte matching "Y", and then mistakenly advance an internal
pointer one byte too far, skipping over the target "Y" just after that.
[bug introduced in grep-2.19]
grep -E rejected unmatched ')', instead of treating it like '\)'.
[bug present since "the beginning"]
On NetBSD, grep -r no longer reports "Inappropriate file type or format"
when refusing to follow a symbolic link.
[bug introduced in grep-2.12]
** Changes in behavior
The GREP_OPTIONS environment variable is now obsolescent, and grep
now warns if it is used. Please use an alias or script instead.
In locales with multibyte character encodings other than UTF-8,
grep -P now reports an error and exits instead of misbehaving.
When searching binary data, grep now may treat non-text bytes as
line terminators. This can boost performance significantly.
grep -z no longer automatically treats the byte '\200' as binary data.
-------------------------------------------------------------------------------
The port has been compiled using a libc.a version compiled from current
repository code and patched with the new malloc code. This package is
available at ftp.delorie.com and mirrors as (time stamp 2014-12-09):
grep 2.21 binaries, info and man format documentation:
ftp://ftp.delorie.com/pub/djgpp/beta/v2gnu/grep221b.zip
grep 2.21 dvi, html, pdf and ps format documentation:
ftp://ftp.delorie.com/pub/djgpp/beta/v2gnu/grep221d.zip
grep 2.21 source:
ftp://ftp.delorie.com/pub/djgpp/beta/v2gnu/grep221s.zip
Send grep specific bug reports to <bug-grep AT gnu DOT org>.
Send suggestions and bug reports concerning the DJGPP port to
comp.os.msdos.djgpp or <djgpp AT delorie DOT com>.
If you are not sure if the failure is really a grep failure
or a djgpp specific failure, report it here and *not* to
<bug-grep AT gnu DOT org>.
Enjoy.
Guerrero, Juan Manuel <juan DOT guerrero AT gmx DOT de>