From: jqb AT netcom DOT com (Jim Balter)
Subject: Re: ASCII and BINARY files. Why?
29 Jan 1997 15:49:06 -0800
Approved: cygnus DOT gnu-win32 AT cygnus DOT com
Distribution: cygnus
Message-ID: <32EFD915.673E.cygnus.gnu-win32@netcom.com>
References:
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-Mailer: Mozilla 3.01Gold (WinNT; I)
Original-To: Fran Litterio
Original-CC: "'gnu-win32 AT cygnus DOT com'"
Original-Sender: owner-gnu-win32 AT cygnus DOT com

Fran Litterio wrote:
>
> Fergus Henderson wrote:
>
> >Jim Balter, you wrote:
>
> >> and/or the library
> >> could allow a special name form such as dos:filename that causes it to
> >> open the file in "text" mode.
> >
> >[...] (b) that still doesn't solve the problem of using Windows tools on
> >gnu-win32 text files.

It would let you create "text" files using GNU apps if you absolutely
needed to.

> What problem is that? Windows tools work fine on gnu-win32 text files
> (i.e., text files without any ^M's) -- at least every Windows tool I
> have tried. David Korn's UWIN project has decided to open all files in
> binary mode all the time precisely because so few (if any) Windows apps
> care whether a text file contains ^M characters or not.

Well, there's always Notepad, and the WinZip internal file viewer, and
probably plenty of others. But it's tolerable.

The current situation is that, even fixing things like gzip, the rest of
the tools are useless on help files, executables, databases, or any other
sort of file that isn't a "text" file. The whole concept of text/binary
for Windows is ill-conceived; it can only be made to work well on systems
where files have meta-data that determines the type, such as VMS (I know
something about that because I wrote Interactive System Corporation's
IS/WB, the equivalent of GNU-win32 for VMS, back in the early 80's). I can
guarantee you that the text/binary split will *never* stop being a major
headache.
The fact that cat throws away characters from files and stops dead at ^Z
makes it hopeless to build robust systems on top of this thing.

One solution would be to do away with the text/binary split and fix any
program that cannot handle CR's within lines. I'm not talking about
throwing them away in filters, as with the current situation, but rather
about making sure that programs that *parse* lines can handle arbitrary
whitespace. This would all be POSIX compatible and viewable as bug fixes,
and thus quite possibly mergeable back into the GNU sources. There might
be a few exceptions where the lines are defined as exactly the bytes up to
a NL, but these are usually configuration files that aren't likely to show
up on Windows systems. This would take care of the input side.

On the output side, a filter that adds CR's before NL's allows creation of
files that need to be in CRNL format. To really complete the picture there
could be mechanisms such as directory tables (like the mount -b flag, but
in reverse), filename tables, .ext tables, or special markers in filenames
such as dos:filename or filename+cr (probably fewer problems with the
latter), that would indicate that those files should have CR's
automatically added before any NL's (but this would occur *only* on
output). Writes to pipes, of course, not being specially marked, would be
raw ("binary"), which would mean they would work with any data, quite
unlike now.

Although this takes source modifications to get some programs to work with
DOS files, most already do, and all the utilities of course already work
with "binary" files and streams, including the ones they themselves
produce. It doesn't take a whole lot of figuring to see that there are an
order of magnitude fewer problems with this approach than with the current
text/binary monstrosity, which is why David Korn chose this approach, even
without any sophisticated mechanisms for output filtering.
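
To make the proposal concrete, here is a rough sketch of its two halves,
written in Python purely for brevity (the tools under discussion are C
programs). Every name in it (parse_fields, add_cr, wants_crnl,
write_output, CR_SUFFIX) is hypothetical and not part of gnu-win32 or
UWIN. It also assumes that an NL already preceded by a CR is left alone,
which the proposal above does not spell out.

```python
# Hypothetical marker taken from the "filename+cr" idea; not a real convention.
CR_SUFFIX = "+cr"


def parse_fields(line):
    """Input side: split a raw line into fields, tolerating a stray CR.

    bytes.split() with no argument splits on runs of ASCII whitespace,
    and CR counts as whitespace, so "a b\r\n" and "a b\n" parse alike.
    """
    return line.split()


def add_cr(data):
    """Output side: put a CR before each NL that does not already have one.

    All other bytes, including ^Z (0x1A), pass through untouched.
    Assumption: an existing CRNL pair is not doubled into CRCRNL.
    """
    out = bytearray()
    prev = None
    for byte in data:
        if byte == 0x0A and prev != 0x0D:
            out.append(0x0D)
        out.append(byte)
        prev = byte
    return bytes(out)


def wants_crnl(name):
    """Return (real_name, add_cr_on_output) for a possibly marked filename."""
    if name.endswith(CR_SUFFIX):
        return name[: -len(CR_SUFFIX)], True
    return name, False


def write_output(name, data):
    """Write raw bytes; translate NL to CRNL only for specially marked names."""
    real_name, crnl = wants_crnl(name)
    with open(real_name, "wb") as f:  # always binary: no hidden translation
        f.write(add_cr(data) if crnl else data)
    return real_name
```

With this shape, write_output("notes.txt+cr", data) would produce a CRNL
file named notes.txt, while an unmarked name (or a pipe) gets the bytes
exactly as given. Reading never translates anything.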

At the very least, it would be nice if there were a configuration option
to set up on a mount -b filesystem, and if the system were tested in such
an environment before being distributed. This would allow people to more
easily locate any problems with interoperability with DOS/Windows files,
and would make a transition away from "text" mode at least conceivable.

In a month or two my major contract will be winding down, and perhaps I
can even contribute some coding time.

--
-
For help on using this list, send a message to
"gnu-win32-request AT cygnus DOT com" with one line of text: "help".