From: tiberius AT braemarinc DOT com (Gary R. Van Sickle)
Subject: RE: Why text=binary mounts
8 Jan 1998 23:48:08 -0800
Message-ID: <01BD1C69.BBA294A0.tiberius.cygnus.gnu-win32@braemarinc.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
To: "gnu-win32 AT cygnus DOT com"

This whole UNIX/DOS/text/binary situation drives me nuts. Why can't this problem be solved once and for all by everybody for all time? We are talking about one '\r', for crissake. What's wrong with this solution:

1. If your program is opening a file that you want to get lines of text from (eg a compiler opening a source file), give fopen a "t"

2. If your program is opening a file that you want the 'binary image of' (eg TAR opening its input files), give fopen a "b"

3. Any crusty old program that doesn't conform to 1 & 2 gets fixed, replaced, or canned

4. fread, fgets, fgetc, etc get written so that when used on a "t" mode file, they strip out '\r's before a '\n' and any ctrl-z at the end of the file.

5. fopen is written so that you *must* give it a "b" or "t" or it abort()s. This weeds out the crusty old programs mentioned in 3. (I know it isn't ANSI. What have they done for us lately? :) )

6. cat to the screen or a printer is binary. Someone writes a filter to convert from text to a format which will look right on the screen or printer and you have to 'cat stdout << filter << textfile.txt'. (I'm obviously not up on my UNIX so please forgive me if this is laughably wrong)

With this solution you have two equally valid text file formats, one with \n indicating end-of-line, one with \r\n indicating EOL and ctrl-z possibly indicating EOF. To a program reading lines of text, they both look the same. A program not reading lines of text doesn't care what the file looks like, and it gets the whole 'binary image'. No 'mount mode' is needed.
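
Point 4 of the list above can be sketched in a few lines of C. This is only an illustration under stated assumptions: text_translate is a hypothetical helper name, not part of any C library or of cygwin32, and it works on a buffer already in memory rather than inside fread/fgets, so it does not depend on any particular fopen mode. It drops a '\r' only when a '\n' follows it directly, and treats a ctrl-z as an EOF marker only when it is the very last byte.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (an assumption for illustration, not a library call):
 * copy len bytes from src to dst the way a "t" mode read might present them.
 *   - a '\r' immediately followed by '\n' is dropped
 *   - a single Ctrl-Z (0x1A) is dropped if it is the last byte
 * dst must have room for len bytes. Returns the number of bytes written.
 */
size_t text_translate(const char *src, size_t len, char *dst)
{
    size_t out = 0;

    assert(src != NULL && dst != NULL);

    /* Ctrl-Z counts as an end-of-file marker only in the final position. */
    if (len > 0 && src[len - 1] == 0x1A)
        len--;

    for (size_t i = 0; i < len; i++) {
        /* Skip a '\r' only when a '\n' follows it directly. */
        if (src[i] == '\r' && i + 1 < len && src[i + 1] == '\n')
            continue;
        dst[out++] = src[i];
    }
    return out;
}
```

Fed either "a\r\nb\n" plus a trailing ctrl-z or plain "a\nb\n", the helper yields the same four bytes, which is the sense in which both text formats look the same to a program reading lines. A lone '\r' that is not followed by '\n' is passed through untouched.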

Let me address one sure-to-come-up complaint right now: the notion that it would be too much work to 'fix' all the existing code. How much time and effort is wasted on 'working around' the current situation? Certainly more time than it would take to search-and-replace "w" with "wt", etc.

Gary R. Van Sickle (tiberius AT braemarinc DOT com)
Electrical Design Engineer
Braemar Inc.
11481 Rupp Dr.
Burnsville, MN 55337
(612) 890-5135 Ext. 144
Fax: (612) 882-6550

-----Original Message-----
From: marcus AT bighorn DOT dr DOT lucent DOT com [SMTP:marcus AT bighorn DOT dr DOT lucent DOT com]
Sent: Thursday, January 08, 1998 10:29 AM
To: gnu-win32 AT cygnus DOT com
Subject: Re: Why text=binary mounts

Jeff Fried writes:
> Porting code from Unix to the PC should NOT require the same line
> termination mode since most Unix code which reads text uses fread/getc
> which automatically handle the end-of-line. And from the replies of most
> people i would argue that most of us would prefer to work in the native
> mode of the operating system in which we are running rather than having to
> constantly convert files between the two models simply because we use tools
> from both operating systems under NT/95. For examples of this
> compatibility look at many of the GNU tools which handle text, the file
> handling will work under both operating systems without any change because
> they use text mode I/O which is platform independent once all files have
> been converted to the form of the native OS.

This is true as long as you are considering text files only. The problem comes in when you also want to deal with binary files. On Unix systems, of course, there is no difference in operations on either, so most Unix programs open all files using the same open() or fopen() calls. On systems that differentiate between these files, it is important to add O_BINARY or O_TEXT to the second argument of open(), and "b" for binary files to the second argument of fopen().
This tells the underlying routines whether to apply any translation to the file. If nothing is specified, the OS must choose whether or not to make translations, and that is where the text=binary versus text!=binary mounting comes in, as this specifies the default mode.

Now, there are some difficulties in this implementation. First, since there is no "t" that can be passed to fopen(), it is impossible to tell if a call to fopen() wants a text mode open, or the default (blame POSIX/ANSI for that, I guess). If you knew that all programs had consciously made a choice about things, there would not be any need for a default, so we could assume that the fopen() without a "b" wants a text mode open and mount things as text!=binary. However, if there exist Unix programs that call fopen() without the "b" for binary files (since it isn't needed on Unix and was added to the standard much later than the program may have been written), then these programs won't run correctly without some additional porting effort. The same goes for programs that call open() without the O_BINARY bit set in the second argument when opening binary files.

To compound this, there are times when it is extremely difficult to impossible to tell if a file should be opened as text or binary. For instance, should TAR open the files that it is writing to an archive as binary or text files? How can it determine which to use?

So, to avoid these issues, many people on this list try to avoid using anything from the Microsoft world (except for NT/95 itself) and use only cygwin32 programs with text=binary so that any file is just like any other file, just like in Unix systems. Since their text files are then only marginally exchangeable with other NT/95 users (or other NT/95 applications), it seems to me that this gives a slow, incomplete, and buggy (well, it is a Beta release!) emulation of Unix with no advantages over Linux except that their boss has declared that they must run NT (in true pointy-haired boss fashion).
Sure, it's fun to play with cygwin32, but to me it doesn't seem reasonable to try to develop it as a Linux replacement.

I think that if it is to be truly useful, cygwin32 must encourage interoperating with the native world that it exists in. Part of that is running well in a text!=binary mounted world. Sure, that means that porting programs to Cygwin32 means that you have to instill an awareness of binary vs. text files, and that does mean more work to port the programs, but it also produces more useful programs as well.

This discussion keeps coming up, which I believe supports my feeling that it is a major issue with cygwin32. I know that in the previous iteration I ended with just agreeing to disagree and I said that I wouldn't say any more on it, but I just wanted to give some support to this side in this iteration and that'll be it (this time around, at least).

marcus hall

Unfortunately, there is no "t" that can be supplied to fopen() to fully disambiguate the three cases that may occur, so we have the following situation:

-
For help on using this list (especially unsubscribing), send a message to "gnu-win32-request AT cygnus DOT com" with one line of text: "help".