Mail Archives: cygwin/1997/01/31/21:25:25

From: franl AT world DOT std DOT com (Francis Litterio)
Subject: Re: ASCII and BINARY files. Why?
Date: 31 Jan 1997 21:25:25 -0800
Approved: cygnus DOT gnu-win32 AT cygnus DOT com
Distribution: cygnus
Message-ID: <E4w69q.GpC.cygnus.gnu-win32@world.std.com>
References: <199701310237 DOT TAA19218 AT nirvanah DOT corp DOT es DOT com>
X-Newsreader: Forte Free Agent 1.0.82
Original-To: gnu-win32 AT cygnus DOT com
Original-Sender: owner-gnu-win32 AT cygnus DOT com

Barry Fishman <bfishman AT nirvanah DOT corp DOT es DOT com> wrote:

> Now back to the ASCII/BINARY discussion.  I think we need to follow
> the principle of least astonishment.  When one opens a file and
> sees ^M 's at the end of each line, you can tell right away what is
> going on.

Humans can see that, but the gnu-win32 DLL software cannot.  Does it
strip the ^Ms or not?  It needs to be told somehow.  This is the crux
of the problem.  Probabilistic text detection (like Perl's -T test)
is completely unacceptable in an OS-emulation package -- the
least astonishment rule dictates that much.
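
To make the objection concrete, here is a minimal Python sketch of the kind of guess such a heuristic makes. It is only loosely in the spirit of a text/binary file test; the function name, the 30% threshold, and the list of "allowed" control characters are assumptions made for illustration, not the algorithm of Perl or of any real library.

```python
def looks_like_text(sample: bytes, threshold: float = 0.30) -> bool:
    """Guess whether a block of bytes is text (hypothetical heuristic)."""
    if not sample:
        return True                      # nothing to object to
    if b"\x00" in sample:
        return False                     # a NUL byte is taken as proof of binary
    allowed_controls = b"\t\n\r\f\b\x1b"
    # Count control characters (other than the allowed ones) and high-bit bytes.
    odd = sum(
        1 for b in sample
        if (b < 0x20 and b not in allowed_controls) or b >= 0x7F
    )
    return odd / len(sample) <= threshold


# A fixed-format binary record that happens to consist of printable bytes
# and CR LF pairs.  The heuristic calls it text, so a layer acting on the
# guess would strip the CRs and silently change the record.
binary_record = b"ID=0042\r\nLEN=0010\r\n"
guess = looks_like_text(binary_record)
```

The guess is right most of the time, which is exactly the problem: the failures are rare, silent, and depend on the data.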

What we are really wrestling with is how do we tell the DLL whether or
not to strip CRs.  Many of us think we just tell it not to strip them
ever.  I liked Jeff Epler's suggestion in message ID
<Mutt DOT 19970130231214 DOT jepler AT craie DOT inetnebr DOT com> of how the user can
configure the behavior with glob lists and file prefixes/suffixes (if
we were to agree on that, we could move on to arguing about what the
default configuration should be! :-)
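
As a rough illustration of glob-driven configuration, the Python sketch below picks a mode from two pattern lists. The details of the referenced message are not reproduced here: the pattern lists, the function name, and the rule that binary patterns are checked first are all hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical user configuration; a real one would be read from a file
# or an environment variable.
TEXT_GLOBS = ["*.txt", "*.c", "*.h", "Makefile"]
BINARY_GLOBS = ["*.exe", "*.o", "*.tar", "*.gz"]


def open_mode_for(filename: str, default: str = "binary") -> str:
    """Return "text" or "binary" for a file name, using the glob lists."""
    # Match on the last path component only, accepting either separator.
    name = filename.replace("\\", "/").rsplit("/", 1)[-1]
    for pattern in BINARY_GLOBS:
        if fnmatchcase(name, pattern):
            return "binary"
    for pattern in TEXT_GLOBS:
        if fnmatchcase(name, pattern):
            return "text"
    return default                       # no pattern matched


mode_for_source = open_mode_for("src/main.c")
mode_for_archive = open_mode_for("C:\\tmp\\backup.tar")
mode_for_unknown = open_mode_for("README")
```

Note that the `default` argument is where the real argument lives: files that match no pattern still need an answer.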

> What is difficult is having to spend hours patching each application to
> get around unexpected problems with seek addresses and files that
> don't match their expected sizes.

Exactly.  Why make hundreds of changes to separate applications if you
can avoid all that hassle by reversing the simple decision to default
to text-mode?
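
The size mismatch is easy to reproduce. The Python sketch below writes CR LF terminated data, then reads it back raw and with newline translation. Python's universal-newline reading stands in here for a CR-stripping text mode; that substitution, the helper name, and the sample data are assumptions for illustration, not the gnu-win32 DLL's actual code path.

```python
import os
import tempfile


def measure(data: bytes):
    """Return (size on disk, bytes read raw, characters read with translation)."""
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(data)
        path = f.name
    try:
        size_on_disk = os.path.getsize(path)
        with open(path, "rb") as f:                       # no translation
            raw_len = len(f.read())
        # newline=None turns each "\r\n" into "\n" while reading.
        with open(path, "r", newline=None, encoding="ascii") as f:
            translated_len = len(f.read())
    finally:
        os.remove(path)
    return size_on_disk, raw_len, translated_len


# Two 8-character lines, each followed by CR LF: 20 bytes on disk.
sizes = measure(b"line one\r\nline two\r\n")
```

Here the file occupies 20 bytes but the translated read yields only 18 characters, one fewer per line. Any program that computes offsets from what it read, or checks what it read against the size the filesystem reports, is now off by the number of lines.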

> I think ANSI created the problem by having the binary/text decision made
> by the application, and not a property of the file.

ANSI C came after both the UNIX and DOS filesystems.  It had no choice
but to accept the underlying behavior of the filesystems (since it was
a language standard not an OS standard).  And if POSIX had tried to
introduce typed files, they'd still be arguing over it.

> Do NT file systems record this information?

No, but they support a file-forking feature called multiple data
streams in which a single file can have multiple separately-named
parts.  An extra data stream could hold meta-information about the
file (such as whether the file contains text or binary data).  But
what about data flowing through pipes?  There's no way to type that
data since it never hits the filesystem.  What if UNIX-style named
pipes are one day supported by gnu-win32 -- does the type of the named
pipe file affect the type of the data flowing through the pipe?  My
brain itches just thinking of the can of worms this opens.  Let this
idea die now ...
--
Francis Litterio                     PGP Key Fingerprint:
franl AT world DOT std DOT com                  02 37 DF 6C 66 43 CD 2C
http://world.std.com/~franl/         10 C8 B5 8B 57 34 F3 21

-
For help on using this list, send a message to
"gnu-win32-request AT cygnus DOT com" with one line of text: "help".
