To: djgpp-announce AT sun DOT soe DOT clarkson DOT edu
Subject: New stat() and fstat() for DJGPP
Date: Mon, 05 Sep 94 09:51:23 +0200
From: eliz AT is DOT elta DOT co DOT il

I've uploaded to omnigate new versions of the stat() and fstat() library functions for DJGPP. To download them, anon ftp to omnigate.clarkson.edu and get /pub/msdos/djgpp/pub/djstat01.zip.

This implementation should eliminate the #ifdef's usually present around various uses of stat() and fstat() in software ported from Unix, by supplying almost 100% Unix-compatible functions. I'd appreciate it if some of you out there find time to test them and drop me a note about the results (see below).

The contents of the README file follow:

Unix-compatible stat() and fstat() for DJGPP
============================================

This package includes new versions of the stat() and fstat() functions for DJGPP, and several support functions, some of which are useful in their own right. See the file MANUAL.DOC for an explanation of the functions.

I've also included the GNU ls program from the latest release 3.9 of GNU Fileutils, compiled and linked with my stat(). This can be used as a vehicle to test this implementation under various environments. Try `ls -li' if you want to see all the mode bits and inode numbers.

Also included are two test programs, STAT.EXE and FSTAT.EXE, which are stat() and fstat() compiled with TEST #define'd. These accept a list of files and call stat() or fstat() on each of them, printing most of the fields of struct stat. FSTAT.EXE also prints the info for the 5 handles opened automatically by the start-up code, before calling fstat() on the files named on the command line. Try `fstat < file1 > file2 *.*' to see how fstat() gets correct info from redirected standard streams.

All of the above were compiled with DJGPP 1.12maint1. If you want to use them with prior versions, you will have to recompile them. There is a makefile in each subdirectory which should make this very easy.
However, note that the library time routines in versions prior to 1.12maint1 will sometimes return an incorrect time stamp, unless you define a complete POSIX TZ string in the environment.

The functions were tested and worked OK under MS-DOS 3.3, 4.01, 5.0 and 6.20 (with and without DoubleSpace), with networked drives under XFS 1.76, Tsoft NFS 0.24Beta and Novell Netware 3.22, and with CD-ROM drives under MSCDEX 2.23. Also tested in a Windows 3.1 DOS Box under DOS 5.0.

Feel free to use them. If you find time to test them in environments other than those listed above, I'd appreciate it if you tell me how they performed. If they ever fail for you in any set-up, *please* let me know so I can fix them. I'm especially interested in DOS clones and emulations such as DR-DOS, Novell DOS, DV/X, OS/2 DOS Box and the like. This implementation relies heavily on undocumented DOS features, some of which are guaranteed to fail under some of these clones, but I've built in fall-back solutions which should still make the functions work at least as well as any other implementation. However, I couldn't test these fully, as I don't have access to every possible environment.

The files stat.c and fstat.c, which do most of the actual work, are heavily commented, as they use many obscure or undocumented DOS features. If anyone is interested enough to go through the source and has comments on it, I will be glad to hear from you.

This implementation has an explicit goal to minimize the pains of porting Unix-born programs to DOS by being as Unix-compatible as possible. It has the following features not usually present in DOS-based C libraries:

1. stat() doesn't fail for root directories.

2. Both stat() and fstat() return the starting cluster number of the file as its inode number. If that number is unavailable, it is ``invented'' using the same name hashing technique as coded by Eric Backus in the existing DJGPP library.
This inode invention is necessary for files on networked and CD-ROM drives and for empty local files.

3. Both functions set mode bits for all three groups (user, group, other). Only user gets WRITE access to a file (unless it has the Read-only, Hidden or System bit set).

4. stat() sets the EXECUTE bit for directories, as it is under Unix, and sets their WRITE access bit, unless they have one of the Read-only, System or Hidden bits set.

5. Directory size is not reported as zero by stat(); for those directories which DOS reports as zero size, the number of used directory entries multiplied by the entry size is returned instead (some network redirectors do report a valid size for a directory).

6. fstat() correctly reports access mode bits and device code (st_dev).

7. Both functions report EXECUTE bits based on the file's extension and the two-byte magic number present at the beginning of the file.

8. Character devices (such as CON, LPT1, AUX and others) are treated by both functions as if they were on a special drive called "@:" (st_dev = -1). The ``character special'' bit is set for these devices.

9. stat() assumes "d:" (where `d' is a drive letter) to mean "d:.", i.e., the CURRENT directory on that drive, and thus doesn't fail for this argument.

The functions in their current version are known to have these (hopefully, minor) deficiencies:

1. fstat() can't get some of the info under Novell Netware version 3.x and below. Specifically, the WRITE access bit, drive letter, and the file's name and extension are unavailable by file handle alone. Until somebody tells me how to get at this info, this problem will cause the mode bits returned for these files to be imprecise, and a different inode number will be returned each time you fstat() the same file. Novell 4.x is said to use the DOS Network Redirector interface, which doesn't take over DOS as completely as the Novell Shell does, so the above should not apply there.

2. Real cluster numbers are 16-bit unsigned numbers, but st_ino is declared short.
In this environment, invented inode numbers start from 65535 and get decremented for each new number. This could result in an invented number which is identical to the cluster number of a real file. Every disk I've seen has at least 80 cluster numbers near 64K which are never assigned to clusters (i.e., the highest actual cluster number is around 65450), but that might not be enough. Note that this could only be a problem for an empty vs. a non-empty file on a local drive, because otherwise st_dev will be different for two such files. Hopefully, in some future DJGPP version, the st_ino field of struct stat will be widened to 32 bits, and this problem will go away, as inode numbers will then start at 65536 and go up.

3. I don't know how to obtain time fields for root directories, so they are set to the beginning of the DOS times (1-Jan-80). Suggestions, anyone?

4. The technique of inventing inode numbers causes a different inode for the same file to be seen by different programs, and could also assign a file a different inode in different runs of the same program. This is unpleasant, but I don't know if this could be repaired easily.

5. fstat() can at best only see the filename part without the full directory path. For files which are empty, or belong to (non-Novell) networked drives, this means there is a slight possibility that two files with the same name in different directories will get the same inode number. As of this implementation of fstat(), this can only happen if the handle belonging to one of the two files is closed after it was fstat()'ed, and then open() for the second file reuses the same entry in the System File Table as was used by the first file. As long as both files are open, this will NEVER happen. It seems to me improbable that a program will be interested in a file after it was close()'d, but still, this could be a cause of obscure bugs in ported programs.
I know of only 2 alternatives which avoid this altogether: either invent a new inode on every call to fstat(), with the consequences described in (1) above, or trap every call to open() and close() to maintain a table of open files with their full names (this would require trapping every possible way of opening a file, e.g. with direct calls to the DPMI server). Both alternatives look less attractive to me, but I would like to hear your comments.

Eli Zaretskii
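
For readers porting Unix code, here is a minimal sketch of the kind of test that meaningful st_dev and st_ino values are meant to support: deciding whether two names refer to the same file. The helper name same_file() is hypothetical and is not part of the package; the sketch assumes only the standard stat() interface and the st_dev and st_ino fields described above.

```c
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical helper, NOT part of djstat01.zip.
   Returns 1 if both names refer to the same file,
   0 if they refer to different files,
   -1 if either stat() call fails. */
int same_file(const char *a, const char *b)
{
    struct stat sa, sb;

    if (stat(a, &sa) != 0 || stat(b, &sb) != 0)
        return -1;

    /* Two names denote the same file only when both the device
       and the inode number match. */
    return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}
```

A caller might write `if (same_file(src, dst) == 1)` to refuse to copy a file onto itself. Since invented inode numbers can differ between programs and between runs, a comparison like this is presumably only meaningful within a single run of one program.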