Mail Archives: cygwin/1998/01/20/00:43:19

From: srn AT techno DOT com (Steven R. Newcomb)
Subject: Re: Why text=binary mounts
Date: 20 Jan 1998 00:43:19 -0800
Message-ID: <199801180326.WAA05374.cygnus.gnu-win32@bruno.techno.com>
References: <3 DOT 0 DOT 3 DOT 32 DOT 19980115090908 DOT 0091b910 AT mail>
To: richard AT smartmedia DOT co DOT uk
Cc: gnu-win32 AT cygnus DOT com

> >If you want to read a line of 
> >text, it seems to me that the most logical thing to do would be to use a 
> >library which gave you access to functions such as fscanf() etc. which have 
> >no meaning for generic (binary) files.  This library then would be the 
> >place to do things like making all text files look the same to the 
> >programmer whether they're DOS/UNIX/Mac/whatever, in the same way that a 
> >PCX library might 'gloss over' the differences between the different PCX 
> >versions.
> 
> Good point. It's also important to remember that not all text is ASCII or
> ANSI; there's EBCDIC and a whole bunch of others too. Maybe a decent
> text library could even handle Unicode files or something (I know nothing
> about Unicode, so don't flame me please) as well. Personally, when I open a
> file, I expect to get what's there. That *should* be the default. A file is
> just a bunch of bytes and that's the way it should be treated. If you want
> some kind of filter or interpretation, get a library.
> 
> A well-written text-processing program should recognise any combination of
> <cr> and <lf> as an end-of-line marker and should write either the
> operating system default (but the OS should have no concept of "text"
> files), the ANSI standard (if there is such a beast), or maybe even a format
> selected by the user.
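
To make the quoted suggestion concrete, here is a minimal sketch in Python of a reader that accepts CRLF, lone CR, or lone LF as a line terminator, and a writer that emits whichever terminator the caller picks. The names split_lines and join_lines are hypothetical, chosen only for this illustration, and the sketch assumes an LF followed by a CR counts as two terminators rather than one.

```python
import re
from typing import List

# CRLF is listed first so a DOS pair is consumed as a single terminator.
# Assumption: an LF followed by a CR is treated as two separate terminators.
_EOL = re.compile(rb"\r\n|\r|\n")


def split_lines(data: bytes) -> List[bytes]:
    """Split raw bytes into lines, accepting CRLF, lone CR, or lone LF."""
    lines = _EOL.split(data)
    # A trailing terminator leaves an empty final piece; drop it.
    if lines and lines[-1] == b"":
        lines.pop()
    return lines


def join_lines(lines: List[bytes], eol: bytes = b"\n") -> bytes:
    """Write lines back out with one terminator chosen by the caller."""
    return b"".join(line + eol for line in lines)
```

The file is opened as plain bytes and the interpretation lives entirely in the library layer, which is the split of responsibilities argued for above: join_lines(split_lines(raw), b"\r\n") rewrites a mixed-convention file with DOS line endings.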

I like the way both of you think.  Sounds to me like you should both
take a look at SGML, ISO 8879:1986.  And particularly at the SGML
Extended Facilities found in ISO/IEC 10744:1997 (see
http://www.hytime.org for pointers including the standard itself).
You will be surprised and pleased, I think, to discover that there is
such a beast, and, marvelous to say, it's already internationally
standardized.  Of course, the paradigm assumes that there are
documents (SGML documents, of course) that declare the notations of
information resources, and that optionally declare the libraries
and/or applications that understand notations of resources.  There are
also storage manager declarations that handle such things as
encryption, sealing, alternative character sets, compression,
containerizations such as tar, multimedia interleaving, etc.  Check it
out.  Some of us, at least, think it's the future of content
management.


-Steve

--
Steven R. Newcomb, President, TechnoTeacher, Inc.
srn AT techno DOT com  http://www.techno.com  ftp.techno.com

voice: +1 972 231 4098 (at ISOGEN: +1 214 953 0004 x137)
fax    +1 972 994 0087 (at ISOGEN: +1 214 953 3152)

3615 Tanner Lane
Richardson, Texas 75082-2618 USA
