Next: DJGPP.ENV, Up: Hidden Features [Contents][Index]
This section describes some advanced features provided by DJGPP. Most of these features are built into the C library, but some are provided by the basic development utilities which are part of the DJGPP development environment. Since DJGPP is a Posix-compliant environment, many of these features are motivated by Unix compatibility.
The DJGPP header files and library functions are highly compatible with other popular environments. In addition to full ANSI and Posix compliance, DJGPP also offers compatibility with many PC and Unix libraries. For example, DJGPP provides library functions that are usually absent from other DOS- and Windows-based libraries, like popen, glob, statfs, getmntent, getpwnam, select, and ftw. Other functions, although they exist in DOS/Windows libraries, are incompatible with Posix in subtle ways. For example, the ANSI-standard function rename typically fails in DOS/Windows implementations if the target file already exists (because the underlying OS call fails). DJGPP makes a point of sticking to Posix or Unix behavior in such cases, even if it means more processing (like removing the target file in the case of rename).
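The rename behavior described above can be checked with a short program. The following is only a minimal sketch: the helper names write_file and replace_by_rename are made up for this illustration, and it relies on nothing beyond standard C I/O. On a Posix-style library the rename call replaces the existing target, so the function returns 0.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes `text` to `path`; returns 0 on success. */
static int write_file(const char *path, const char *text)
{
    FILE *f = fopen(path, "w");
    if (f == NULL)
        return -1;
    if (fputs(text, f) == EOF) {
        fclose(f);
        return -1;
    }
    return fclose(f) == 0 ? 0 : -1;
}

/* Hypothetical helper: creates both files, renames `src` over the
   already existing `dst`, and verifies that `dst` now holds the
   contents of `src`.  Returns 0 if the rename replaced the target. */
int replace_by_rename(const char *src, const char *dst)
{
    char buf[32] = "";
    FILE *f;

    if (write_file(src, "new") != 0 || write_file(dst, "old") != 0)
        return -1;
    if (rename(src, dst) != 0)      /* Posix: an existing target is replaced */
        return -1;
    f = fopen(dst, "r");
    if (f == NULL)
        return -1;
    if (fgets(buf, sizeof buf, f) == NULL) {
        fclose(f);
        return -1;
    }
    fclose(f);
    return strcmp(buf, "new") == 0 ? 0 : -1;
}
```

A library whose rename fails when the target exists would make this function return -1 at the rename step.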
A case in point is the library functions stat and fstat. Unix programs make extensive use of the inode number and the mode bits returned by these functions. For example, GNU diff examines the inode numbers of the files it is about to compare, and if they are equal, exits immediately on the assumption that both file names point to the same file. However, DOS and Windows don’t support inodes, and most other DOS/Windows implementations return zero in the st_ino member of struct stat, which of course breaks diff. Also, the mode bits returned by fstat are usually incorrect. In contrast, the DJGPP implementation goes out of its way to make these functions compatible, and in particular returns meaningful inode numbers, even though it takes quite a lot of code (for example, the compiled stat code totals about 17KB, together with the other library functions it calls).
Such high compatibility makes porting programs very easy.
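The kind of check attributed to GNU diff above can be sketched as follows. This is an illustration, not diff's actual code: the function name same_file is hypothetical, and comparing the device number in addition to the inode number is an assumption of this sketch (it is the usual Unix idiom). The check only works when stat returns meaningful inode numbers.

```c
#include <sys/stat.h>

/* Hypothetical helper: returns 1 if both names refer to the same file,
   0 if they refer to different files, and -1 if either stat call fails. */
int same_file(const char *a, const char *b)
{
    struct stat sa, sb;

    if (stat(a, &sa) != 0 || stat(b, &sb) != 0)
        return -1;
    /* Same device and same inode means the same underlying file. */
    return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}
```

With a library that returns zero for every inode number, this function would report any two files on the same drive as identical.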
When DOS invokes programs, it limits the length of the command line to 126 characters (excluding the program’s name). This is a ridiculously small limit; it is not even enough to compile GCC, since many commands in the GCC Makefiles are much longer.
Therefore, DJGPP provides a mechanism to pass long command lines to child programs. The actual command is stored in the transfer buffer, and a pointer to that buffer is passed to the child program instead of the command line itself. The startup code of the child program then retrieves the actual command-line arguments and puts them into the argv[] array passed to main.
DJGPP also supports the so-called response file method of passing long command lines, whereby the command line is stored on a disk file, and the name of that file is passed as ‘@response-file’. For example:
ar cq libmylib.a @files-list
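To make the response file idea concrete, here is a minimal sketch of how a program could read its arguments from such a file. It is not the DJGPP startup code: the function names are hypothetical, arguments are assumed to be separated by whitespace, quoting is not handled, and words longer than 255 characters would be split.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: frees the first n strings and the array itself. */
static void free_args(char **args, int n)
{
    while (n > 0)
        free(args[--n]);
    free(args);
}

/* Hypothetical helper: reads whitespace-separated arguments from the
   file `path`.  Returns a malloc'ed, NULL-terminated array and stores
   the number of arguments in *count; returns NULL on any error. */
char **read_response_file(const char *path, int *count)
{
    FILE *f = fopen(path, "r");
    char **args;
    char word[256];
    int n = 0;

    if (f == NULL)
        return NULL;
    args = malloc(sizeof *args);            /* room for the final NULL */
    while (args != NULL && fscanf(f, "%255s", word) == 1) {
        char **grown = realloc(args, (n + 2) * sizeof *args);
        if (grown == NULL) {
            free_args(args, n);
            args = NULL;
            break;
        }
        args = grown;
        args[n] = malloc(strlen(word) + 1);
        if (args[n] == NULL) {
            free_args(args, n);
            args = NULL;
            break;
        }
        strcpy(args[n++], word);
    }
    fclose(f);
    if (args == NULL)
        return NULL;
    args[n] = NULL;
    *count = n;
    return args;
}
```

In the ar example above, a call like read_response_file("files-list", &n) would yield the list of object files that replaces the ‘@files-list’ argument.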
All Unix programs assume that any file-name wildcards on their command line were already expanded by the shell, to yield normal file names. But DOS shells don’t provide this functionality, so the wildcards would wind up verbatim in the argv[] array. To avoid the need to have special code in every ported program that expands the wildcards, the DJGPP startup code expands the wildcards automatically. The expansion follows the Unix conventions, so ‘*’ expands to all file names, unlike the DOS conventions where it excludes file names with extensions.
The globbing code supports Unix-style quoting with the ‘'’ and ‘"’ characters (most other DOS/Windows compilers and shells only support ‘"’). Escaping special characters with ‘\’ is limited to the quote characters themselves, since ‘\’ serves as a directory separator in DOS/Windows file names.
DJGPP also provides a special extension: the ‘...’ wildcard expands recursively to all the subdirectories. Thus, the following command would search all files in all the subdirectories, recursively:
grep foo .../*
(This was hard to achieve even on Unix, until the recent release of the GNU Grep package introduced the ‘--recursive’ option.)
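A program can perform the same kind of Unix-style expansion explicitly with the glob library function, which is among the functions listed earlier as provided by DJGPP. The sketch below uses only the standard Posix glob interface; the function name count_matches is hypothetical, and the DJGPP-specific ‘...’ wildcard is not part of standard glob, so it is not covered here.

```c
#include <glob.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: prints every file name matching `pattern` and
   returns the number of matches, or -1 if the expansion fails. */
int count_matches(const char *pattern)
{
    glob_t g;
    size_t i;
    int n;
    int rc = glob(pattern, 0, NULL, &g);

    if (rc == GLOB_NOMATCH)
        return 0;                   /* nothing matched the pattern */
    if (rc != 0)
        return -1;                  /* out of memory or read error */
    for (i = 0; i < g.gl_pathc; i++)
        puts(g.gl_pathv[i]);        /* one expanded name per line */
    n = (int)g.gl_pathc;
    globfree(&g);
    return n;
}
```

For instance, count_matches("*.[ch]") lists the C sources and headers in the current directory, much like the names a Unix shell would place in argv[].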
system function.
Traditionally, the system library function calls the shell to process its argument. However, the stock DOS shell COMMAND.COM is too dumb to be useful in many cases. For example, it doesn’t support long command lines, even though DJGPP programs do; it doesn’t understand forward slashes in file names; and it doesn’t return the exit code of the child program to the parent.
Therefore, the DJGPP version of system usually doesn’t call COMMAND.COM at all. Instead, it internally emulates its functionality, including redirection and pipes, and invokes the programs directly. This makes it possible to provide the following important features:
This is described under “Long command lines” above, but here it means that shell commands can have arbitrary length, even though the shell itself doesn’t support that!
File names which are targets of redirection can be given in the Unix /foo/bar style. Unix devices, such as /dev/null, are also supported (see “Transparent conversion of special file names”, below).
The emulation code supports the ‘foo ; bar’ feature of several commands separated by a semi-colon.
The emulation of the shell command ‘cd’ allows Unix-style forward slashes in its argument, and also changes the drive if the argument includes the drive letter.
If the environment variable SHELL points to a name like sh or bash, system invokes the shell to do everything, since the internal shell emulation is not sophisticated enough to cover Unix shell functionality.
Shell scripts can be invoked even if the SHELL environment variable doesn’t point to a Unix-style shell, provided that the interpreter whose name appears on the first script line after the ‘#!’ signature can be found somewhere along the PATH.
COMMAND.COM is only invoked by system to run batch files or commands internal to the shell. However, system always looks for external programs first, so if you have e.g. a port of the GNU echo program installed, system will call it even though COMMAND.COM has an internal (and very much inferior) command by that name.
These features come in especially handy in the DJGPP port of GNU make. Where the original Unix code of make invokes the shell, the DJGPP port simply calls system to execute the commands in rules, and automatically gets support for long command lines and Unix-style shells required to run many Makefiles of Unix origin.
The above extended functionality also means that whenever a Unix program calls system, in most cases the same call will work without any changes when compiled with DJGPP. The result is not only ease of porting, but also a lower probability of leaving subtle bugs in the ported program due to an overlooked fragment which assumes a Unix shell.
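As a small illustration, the call below uses two of the features listed above: several commands separated by a semi-colon, and redirection to the Unix-style device /dev/null. It is a sketch only; the function name run_two_steps and the echo commands are placeholders chosen for this example.

```c
#include <stdlib.h>

/* Hypothetical example: runs two commands separated by ';' and discards
   their output by redirecting it to /dev/null.  Returns whatever system
   returns; 0 means the last command succeeded. */
int run_two_steps(void)
{
    return system("echo compiling > /dev/null ; echo linking > /dev/null");
}
```

The same source line is what a Unix program would contain, which is the point of the emulation: no DOS-specific rewrite of the command string is needed.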
All DJGPP library functions pass file names to DOS via a single low-level function. This makes it possible to remap some special file names to their DOS equivalents. For example, the Unix-standard device names /dev/null and /dev/tty are converted to their DOS equivalents NUL and CON, respectively. File names which begin with /dev/x/, where x is a drive letter, are converted to the DOS x:/ form; this is required for running some Unix shell scripts which take apart the PATH variable where colons separate directories.
In addition, file names which begin with /dev/env/ are expanded using the environment variables. For example, /dev/env/DJDIR expands into the full path name of the top DJGPP installation directory, since the environment variable DJDIR has that directory as its value.
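The mappings just described can be summarized in code. The function below is a simplified sketch written for this text, not the library's internal function: the name map_special_name and its interface are hypothetical, it handles only the cases mentioned above, and it uses the C99 snprintf function.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: writes the DOS-style equivalent of `name` into
   `out` (capacity `cap`).  Returns 0 on success, -1 if the result does
   not fit or a referenced environment variable is not set. */
int map_special_name(const char *name, char *out, size_t cap)
{
    int n;

    if (strcmp(name, "/dev/null") == 0) {
        n = snprintf(out, cap, "NUL");
    } else if (strcmp(name, "/dev/tty") == 0) {
        n = snprintf(out, cap, "CON");
    } else if (strncmp(name, "/dev/env/", 9) == 0) {
        /* /dev/env/VAR/rest -> value of VAR followed by /rest */
        const char *var = name + 9;
        const char *rest = strchr(var, '/');
        size_t len = rest ? (size_t)(rest - var) : strlen(var);
        char varname[64];
        const char *value;

        if (len == 0 || len >= sizeof varname)
            return -1;
        memcpy(varname, var, len);
        varname[len] = '\0';
        value = getenv(varname);
        if (value == NULL)
            return -1;
        n = snprintf(out, cap, "%s%s", value, rest ? rest : "");
    } else if (strncmp(name, "/dev/", 5) == 0
               && isalpha((unsigned char)name[5]) && name[6] == '/') {
        /* /dev/x/rest -> x:/rest */
        n = snprintf(out, cap, "%c:/%s", name[5], name + 7);
    } else {
        n = snprintf(out, cap, "%s", name);     /* ordinary name: unchanged */
    }
    return (n < 0 || (size_t)n >= cap) ? -1 : 0;
}
```

For example, with DJDIR set to c:/djgpp, the name /dev/env/DJDIR/bin maps to c:/djgpp/bin, and /dev/c/djgpp/bin maps to the same string.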
The filesystem extensions facility is built into the low-level file-oriented library functions. It allows the application to install a handler for certain filesystem calls, like open, read, fstat, dup, close, etc. If installed, such a handler is called just before the appropriate primitive is invoked to pass the call to DOS. If the handler returns a non-zero value, it is assumed to have handled the call, and the usual primitive call is bypassed. Otherwise, the library proceeds with calling DOS as usual.
This facility provides an easy way of handling special files and devices which DOS and Windows don’t support directly. For example, a program can install a handler for special file names like /dev/ptyp0 and emulate these non-existent devices via an async communications library.
Another way of putting filesystem extensions to good use is when there’s a need to emulate functionality that DOS file I/O doesn’t support, even though the associated devices do exist. For example, suppose you need to port code which sends special commands to the terminal device via termcap functions. DOS supports a terminal device, but doesn’t support termcap. However, it is possible to achieve the same effects if direct screen writes are used instead of file I/O. By installing a filesystem extension handler for the standard output handle, you could redirect all terminal I/O to direct screen writes and implement all the necessary termcap functionality, without any changes to the program’s source code. This is how the DJGPP port of GNU ls supports the ‘--color’ option without forcing users to install ANSI.SYS, which is a special terminal driver that interprets ANSI escape sequences (and also has several nasty side-effects).
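The calling convention described above (a handler that returns non-zero has handled the call; otherwise the normal primitive runs) can be modeled in portable C. The sketch below is a simplified model, not the DJGPP interface: all names in it (open_handler, install_open_handler, hooked_open, pty_handler) are hypothetical, it covers only open, and the descriptor value 1000 is an arbitrary stand-in for an emulated device.

```c
#include <fcntl.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical handler type: returns non-zero if it handled the call,
   in which case *rv holds the value the call should return. */
typedef int (*open_handler)(const char *path, int flags, int *rv);

static open_handler installed_handler = NULL;

/* Hypothetical registration function. */
void install_open_handler(open_handler h)
{
    installed_handler = h;
}

/* Plays the role of the library's open: the handler is consulted first. */
int hooked_open(const char *path, int flags)
{
    int rv;

    if (installed_handler != NULL && installed_handler(path, flags, &rv))
        return rv;                  /* handled: the usual call is bypassed */
    return open(path, flags);       /* not handled: proceed as usual */
}

/* Example handler that claims one emulated device name. */
int pty_handler(const char *path, int flags, int *rv)
{
    (void)flags;
    if (strcmp(path, "/dev/ptyp0") != 0)
        return 0;                   /* not ours: let the normal open run */
    *rv = 1000;                     /* arbitrary descriptor for the emulated device */
    return 1;
}
```

After install_open_handler(pty_handler), opening /dev/ptyp0 through hooked_open yields the emulated descriptor, while every other name still goes to the operating system.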
DOS system calls are limited to file names in the so-called 8+3 format: maximum 8 characters for the basename and maximum 3 characters for the extension. Therefore, it is impossible to access the long file names, offered by Windows 9X and Windows NT, via the DOS system calls. However, Windows 9X provides a special API (a bunch of special functions of software interrupt 21h) that allows DOS programs to access long file names. This API is widely known as the LFN API, where LFN is an acronym for Long File Names. For each file-oriented DOS system call, the LFN API includes a replacement that supports long file names. For example, there are functions to open files, list the files in a directory, create a directory, etc. using long names. The LFN API also adds several functions to access extended functionality supported by the Windows filesystems. For example, it is possible to get and set three time stamps for each file, like on Unix, instead of the single time stamp supported by DOS.
The DJGPP library features transparent and automatic support for long file names on Windows 9X [1]. The DJGPP startup code queries the system for the availability of the LFN API, and if it’s available, all low-level file-oriented primitives are automatically switched to using the special LFN-aware functions. This run-time detection of the LFN support means that the same executable will run on DOS and on Windows, and will automatically support long file names when it runs on Windows 9X.
DOS doesn’t support hard and symbolic links. However, DJGPP emulates them to some extent. The link library function simulates hard links by copying. Symbolic links are fully emulated by most file handling functions in the library. Also, the symlink support API that you would expect to find only on Unix (such as readlink) is present. The symlink function creates files with a special size and format which are recognized by other library functions. Because DOS itself and most DOS applications weren’t designed with symlinks in mind, there is a subset of the file handling API which intentionally does not support symlinks. This includes functions with the _dos_ prefix, as well as other functions of DOS origin, such as findfirst.
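The symlink and readlink functions mentioned above are used the same way as on Unix. The following sketch relies only on their standard Posix signatures; the wrapper name make_and_read_link is hypothetical.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: creates the symbolic link `linkname` pointing at
   `target`, then reads the link back into `buf` (capacity `cap`).
   Returns 0 on success and -1 on failure. */
int make_and_read_link(const char *target, const char *linkname,
                       char *buf, size_t cap)
{
    ssize_t n;

    if (cap == 0)
        return -1;
    if (symlink(target, linkname) != 0)
        return -1;
    n = readlink(linkname, buf, cap - 1);
    if (n < 0)
        return -1;
    buf[n] = '\0';          /* readlink does not terminate the string */
    return 0;
}
```

Note that the target need not exist when the link is created; readlink simply returns the stored name.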
Emacs is special because when it dumps itself during the build process, static and global variables are frozen in the dumped image with the last value they had at the time the program was dumped. DJGPP has a special facility in the library through which library functions can detect that the program was dumped and restarted. All library functions that need static variables use this facility to reinitialize them. This allows Emacs to be built with DJGPP without the need to analyze whether each library function called by Emacs is dump-safe.
In addition to relying on the GNU development toolchain, DJGPP introduces several utilities written specifically for the project. These utilities are meant to assist the developer with specific tasks common in the DJGPP environment. Some of these utilities are listed below:
djtar is a program that unpacks archives (but cannot create them). It was originally written to unpack files created by tar, because DOS and Windows lack standard programs for that. Since the original release, djtar functionality was significantly extended, and now it can unpack .tar.gz and .zip files as well. It can also unpack archives from floppy disks written as raw /dev/rfd0a devices on Unix systems, and it uncompresses and untars .tar.gz files on the fly, by feeding the untar code with the output of the unzip code. The latter feature is very important when unpacking large distributions, such as emacs-XX.YY.tar.gz, because pipes are implemented as temporary disk files on DOS/Windows, and so on-the-fly decompression avoids creating huge temporary disk files.
The ability to unzip .zip archives makes djtar the only free program which does that, since it turns out that InfoZip’s UnZip license does not comply with FSF’s definition of free software (according to Richard Stallman).
In addition, djtar offers several features designed to prevent problems due to DOS/Windows file-name restrictions; see “DOS file names handling”, below.
These two programs come in handy when you need to carry a large file (usually, a compressed archive of a large distribution) on floppies. djsplit splits a file into smaller chunks whose size is user-defined, and djmerge splices the chunks back together.
dtou and utod are close cousins of dos2unix and unix2dos, respectively, but they have several clever tricks up their sleeves. First, they take file names from the command-line arguments and rewrite each file, instead of reading stdin and writing stdout; thus, they can convert many files in a single run. And second, they preserve the time stamps of the converted files, to keep utilities like make happy. With these programs, you can convert the entire directory tree of C source files to the DOS CR-LF format with a single command:
utod .../*.[ch]
This uses the DJGPP wildcard expansion and the special ... wildcard mentioned above.
update is a replacement for the well-known move-if-changed shell script. It is very handy in Makefiles which should run on systems that don’t have Bash installed. Since it understands Unix-style forward slashes (like all DJGPP programs do), it is also widely used in Makefiles for copying files, instead of the shell’s internal COPY command, since make does not cope well with backslashes in file names.
As its name implies, redir redirects standard handles. It was originally written to allow redirection of stderr, which the stock DOS shell COMMAND.COM cannot do. You need this redirection, e.g., when GCC spits out a long list of error messages which scroll off the screen. redir can also append redirected handles (à la ‘>>’) and redirect stderr to the same place as stdout or vice versa, like what ‘>&’ does in Unix shells.
In addition, redir reports the exit status of the program it runs, and prints the elapsed time used by the child. These features are provided because, unlike on Unix, there are no standard utilities to do that.
DJGPP debugging support doesn’t include Unix-style core files which allow post-mortem debugging of a crashed program. To compensate for this deficiency, when a program crashes, a special library module prints the values stored in the CPU registers and the traceback of the function calls that led to the crash, as stored in the call frames pushed onto the stack.
However, the stack traceback, as printed, is hard to interpret, because it only includes numeric addresses of the functions. The symify program solves this problem. It reads the traceback directly from the video memory, and uses the debug info recorded in the program’s executable file to convert the addresses into file names and line numbers of the source files. It then adds the file names and line numbers next to the corresponding addresses, thus making the traceback easy to comprehend.
Besides the library functions and DJGPP-specific programs, a lot of special code went into the utilities ported to DJGPP, so that these utilities could work together smoothly and have the effect a user would expect. Some of these extensions are listed below:
PATH format.
Unix uses ‘:’ to separate directory names in the value of environment variables such as PATH. Many shell scripts rely on this feature to look for programs along the PATH. For example, the GNU-standard configure scripts do that to find gcc, ranlib and other programs, as part of the auto-configuration process.
However, DOS and Windows use ‘;’ to separate directories in PATH (because absolute file names include a drive letter, like in d:/foo/bar). This breaks shell scripts which search along the PATH.
To allow these scripts to run without changes, the DJGPP port of Bash introduces a special variable PATH_SEPARATOR. If this variable is set to ‘:’, Bash converts the value of PATH to pseudo-Unix form. For example, if the original value of PATH is like this:
PATH=c:\djgpp\bin;d:\gnu\emacs\bin
then setting ‘PATH_SEPARATOR=:’ converts it to this:
PATH=/dev/c/djgpp/bin:/dev/d/gnu/emacs/bin
This lets Unix shell scripts run unaltered. However, to prevent the external commands from breaking (because they don’t know anything about PATH_SEPARATOR), Bash converts the value of PATH back to its usual DOS style in the environment it passes to child programs.
The DJGPP library supports the special /dev/x/ file names by converting them to the usual DOS x:/ format, before it issues DOS calls, so all DJGPP-compiled utilities can be safely run by a script when PATH_SEPARATOR is set to ‘:’.
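The conversion of PATH to pseudo-Unix form shown above amounts to three substitutions: a leading drive letter x: becomes /dev/x, each backslash becomes a forward slash, and each ‘;’ becomes ‘:’. The function below is a sketch of that transformation written for this text; it is not taken from the Bash port, and the name path_to_pseudo_unix is hypothetical.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: converts a DOS-style PATH value such as
   "c:\djgpp\bin;d:\gnu" into "/dev/c/djgpp/bin:/dev/d/gnu".
   Returns 0 on success, -1 if `out` (capacity `cap`) is too small. */
int path_to_pseudo_unix(const char *dos_path, char *out, size_t cap)
{
    size_t o = 0;
    int at_start = 1;       /* at the beginning of a PATH component */

    while (*dos_path != '\0') {
        if (at_start && isalpha((unsigned char)dos_path[0])
            && dos_path[1] == ':') {
            /* drive prefix "x:" becomes "/dev/x" */
            if (o + 6 >= cap)
                return -1;
            memcpy(out + o, "/dev/", 5);
            o += 5;
            out[o++] = dos_path[0];
            dos_path += 2;
            at_start = 0;
            continue;
        }
        if (o + 1 >= cap)
            return -1;
        at_start = (*dos_path == ';');
        if (*dos_path == ';')
            out[o++] = ':';         /* DOS separator -> Unix separator */
        else if (*dos_path == '\\')
            out[o++] = '/';         /* backslash -> forward slash */
        else
            out[o++] = *dos_path;
        dos_path++;
    }
    if (o >= cap)
        return -1;
    out[o] = '\0';
    return 0;
}
```

Applied to the PATH value from the example above, the function produces exactly the pseudo-Unix string shown there.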
PATH.
make to be happy when Unix Makefile is in use (since the target names are usually extension-less on Unix), while the second can be run from the DOS command prompt, since stock DOS shell refuses to run a program without one of the executable extensions (.exe, .com or .bat) it knows about. Both of these features are intended for using Unix Makefiles without changes.
PATH as well, so that users won’t need to have a /bin directory.
lpr, write to the local printer device instead, if lpr could not be located. Emacs and dvips are two examples of programs that offer this feature.
tar and cpio programs, and the djtar utility supplied with the DJGPP development kit are examples of such programs. They replace characters which aren’t allowed in file names, like ‘+’ on MS-DOS or ‘"’ on MS-Windows, and rename files whose names are reserved on DOS/Windows by character devices (and therefore writing to them could have unexpected results).
Another potential problem in unpacking file archives is that several different file names can map to the same name after truncation to the DOS 8+3 limits or as a result of the automatic renaming just described. For this reason, djtar refuses to overwrite existing files, and requires the user to type in another name under which the file will be extracted. If the user presses RET, the file is skipped.
This interactive, one-by-one renaming might be tedious and error-prone when there are a lot of files to rename. A case in point is the test suite in the GNU Textutils distribution with a lot of names like n+4b2l10f-0FF, njml17f-lmlmlo, etc. For these cases, djtar has a command-line option which can be used to submit a file with a mapping between original and DOS names; djtar will automatically rename every file mentioned there and will leave all other file names intact. An example of putting this feature to use can be seen in the latest versions of Textutils (look for the file djgpp/fnchange.lst and the instructions to use it in djgpp/README).
The features mentioned above are mostly small niceties. But can you imagine the amount of hacking needed to get Unix Makefiles and shell scripts to work on DOS and Windows machines, if these tidbits didn’t exist?
[1] Windows NT does not include this API; therefore, DJGPP programs cannot access long file names on NT systems. However, a free LFN driver for NT is available from the v2misc/ directory of the DJGPP archive on Simtel.NET.