Mail Archives: cygwin/1997/01/29/15:49:06

From: jqb AT netcom DOT com (Jim Balter)
Subject: Re: ASCII and BINARY files. Why?
Date: 29 Jan 1997 15:49:06 -0800
Approved: cygnus DOT gnu-win32 AT cygnus DOT com
Distribution: cygnus
Message-ID: <32EFD915.673E.cygnus.gnu-win32@netcom.com>
References: <c=US%a=_%p=Amulet._Inc.%l=JAGUAR-970129201251Z-7139 AT jaguar DOT amulet DOT com>
Mime-Version: 1.0
X-Mailer: Mozilla 3.01Gold (WinNT; I)
Original-To: Fran Litterio <franl AT amulet DOT com>
Original-CC: "'gnu-win32 AT cygnus DOT com'" <gnu-win32 AT cygnus DOT com>
Original-Sender: owner-gnu-win32 AT cygnus DOT com

Fran Litterio wrote:
> 
> Fergus Henderson wrote:
> 
> >Jim Balter, you wrote:
> 
> >> and/or the library
> >> could allow a special name form such as dos:filename that causes it to
> >> open the file in "text" mode.
> >
> >[...] (b) that still doesn't solve the problem of using Windows tools on
> >gnu-win32 text files.

It would let you create "text" files using GNU apps if you absolutely
needed to.

> What problem is that?  Windows tools work fine on gnu-win32 text files
> (i.e., text files without any ^M's) -- at least every Windows tool I
> have tried.  David Korn's UWIN project has decided to open all files in
> binary mode all the time precisely because so few (if any) Windows apps
> care whether a text file contains ^M characters or not.

Well, there's always Notepad, and the WinZip internal file viewer,
and probably plenty of others.  But it's tolerable.

The current situation is that, even after fixing things like gzip,
the rest of the tools are useless on help files, executables,
databases, or any other sort of file that isn't a "text" file.  The
whole concept of text/binary for Windows is ill-conceived; it can only
be made to work well on systems where files have meta-data that
determines the type, such as VMS (I know something about that because I
wrote Interactive Systems Corporation's IS/WB, the equivalent of
GNU-win32 for VMS, back in the early 80's).
 
I can guarantee you that the text/binary split will *never* stop
being a major headache.  The fact that cat throws away characters
from files and stops dead at ^Z makes it hopeless to build robust
systems on top of this thing.
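
To make the data loss concrete, here is a minimal sketch that simulates
a DOS-style "text" mode read.  It is not the gnu-win32 library code:
the function name is made up for illustration, and the exact rules
(stop at the first ^Z, drop the CR of each CR NL pair) are an
assumption about typical text-mode behavior.

```python
CTRL_Z = b"\x1a"  # DOS end-of-file marker (^Z)


def dos_text_mode_read(raw: bytes) -> bytes:
    """Simulate a text-mode read (hypothetical helper, assumed rules).

    Everything from the first ^Z onward is discarded, and each CR NL
    pair is collapsed to a bare NL.
    """
    end = raw.find(CTRL_Z)
    if end != -1:
        raw = raw[:end]
    return raw.replace(b"\r\n", b"\n")


# A binary payload that happens to contain CR NL and ^Z bytes.
payload = b"\x89PNG\r\n\x1a\n" + b"rest of file"
seen = dos_text_mode_read(payload)
print(len(payload), "bytes on disk,", len(seen), "bytes seen by the program")
```

Any filter reading through such a layer sees a truncated, altered copy
of the payload, while a file without CR or ^Z bytes passes through
untouched.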

One solution would be to do away with the text/binary
split and fix any program that cannot handle CR's within
lines.  I'm not talking about throwing them away in filters,
as with the current situation, but rather about making sure that programs
that *parse* lines can handle arbitrary whitespace.  This would all
be POSIX compatible and viewable as bug fixes, and thus quite possibly
mergeable back into the GNU sources.  There might be a few exceptions
where the lines are defined as exactly the bytes up to a NL,
but these are usually configuration files that aren't likely to show
up on Windows systems.  This would take care of the input side.
On the output side, a filter that adds CR's before NL's allows creation
of files that need to be in CRNL format.  To really complete
the picture there could be mechanisms such as directory tables
(like the mount -b flag, but in reverse), filename tables, .ext tables,
or special markers in filenames such as dos:filename or filename+cr
(probably fewer problems with the latter), that would indicate that
those files should have CR's automatically added before any NL's
(but this would occur *only* on output).  Writes to pipes, of course,
not being specially marked, would be raw ("binary"), which would mean
they would work with any data, quite unlike now.
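
The scheme above can be sketched roughly as follows.  This is only an
illustration under stated assumptions: split_fields, add_cr and
encode_for are hypothetical names, the "+cr" suffix is the filename
marker suggested above, and leaving an existing CR NL pair alone
(rather than doubling the CR) is a choice made for this sketch.

```python
CR_MARKER = "+cr"  # hypothetical filename suffix requesting CR NL output


def split_fields(line: bytes) -> list:
    """Input side: parse a line while treating CR as ordinary whitespace."""
    return line.split()  # bytes.split() with no argument splits on any ASCII whitespace


def add_cr(data: bytes) -> bytes:
    """Output side: insert a CR before every NL not already preceded by one."""
    out = bytearray()
    prev = None
    for byte in data:
        if byte == 0x0A and prev != 0x0D:
            out.append(0x0D)
        out.append(byte)
        prev = byte
    return bytes(out)


def encode_for(name: str, data: bytes) -> tuple:
    """Return (real filename, bytes to write) for a possibly marked name.

    Only names carrying the marker get CRs added; everything else,
    pipes included, is written raw.
    """
    if name.endswith(CR_MARKER):
        return name[: -len(CR_MARKER)], add_cr(data)
    return name, data


print(split_fields(b"key  value\r\n"))
print(encode_for("report.txt+cr", b"line one\nline two\n"))
print(encode_for("data.bin", b"\x00\n\x1a"))
```

Note that add_cr works on a whole buffer; a real output layer would
have to carry the "previous byte" state across separate writes.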

Although this takes source modifications to get some programs to
work with DOS files, most already do, and all the utilities
of course already work with "binary" files and streams, including
the ones they themselves produce.

It doesn't take a whole lot of figuring to see that there are an order
of magnitude fewer problems with this approach than the current
text/binary monstrosity, which is why David Korn chose this approach,
even without any sophisticated mechanisms for output filtering.

At the very least, it would be nice if there were a configuration
option to set up on a mount -b filesystem, and if the system were
tested in such an environment before being distributed.  This would
allow people to more easily locate any problems with interoperability
with DOS/Windows files, and would make a transition away from "text"
mode at least conceivable.  In a month or two my major contract
will be winding down, and perhaps I can even contribute some coding
time.

--
<J Q B>
-
For help on using this list, send a message to
"gnu-win32-request AT cygnus DOT com" with one line of text: "help".
