From: jqb AT netcom DOT com (Jim Balter)
Subject: Re: Text file format (off-topic, was Re: using cat on binary files (
Date: 30 Oct 1996 01:13:32 -0800
Sender: daemon AT cygnus DOT com
Approved: cygnus DOT gnu-win32 AT cygnus DOT com
Distribution: cygnus
Message-ID: <199610300455.UAA09829.cygnus.gnu-win32@netcom23.netcom.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Original-To: kerr AT wizard DOT net (Shane Kerr)
Original-Cc: jqb AT netcom DOT com, gnu-win32 AT cygnus DOT com
In-Reply-To: <199610300258.VAA04111@wizard.wizard.net> from "Shane Kerr" at Oct 29, 96 09:58:08 pm
X-Mailer: ELM [version 2.4 PL23]
Content-Length: 2975
Original-Sender: owner-gnu-win32 AT cygnus DOT com

Shane Kerr wrote:
>
> I know this is probably pretty far off of the GNU-WIN32 topic, but...
>
> > Maybe we can file a class action suit for a few billion against the
> > turkey who unleashed on the world a system with such fundamentally
> > bad design decisions as a two-character EOL indicator and an in-band
> > EOF indicator.
>
> You have to understand where the Win32 file system came from: MS-DOS.
> Then you have to understand where the MS-DOS file system came from:
> CP/M.  In CP/M, there was no system information describing the size
> of a file - only the number of blocks that it used.  So an in-band
> EOF indicator was needed.

An EOF indicator was needed because the file system didn't maintain
a byte count.  Was not maintaining a byte count necessary?  No; it was
a bad design decision to keep only a block count and not a byte count,
despite hundreds of file systems in existence at the time that kept
byte counts.  Many really bad systems that kept only sector or word
counts had been distributed by individual computer vendors, and those
systems had been recognized as being mistakes and abandoned in favor of
better technology.  The CP/M file system was designed by amateurs
without an appreciation for the state of the art.
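To make the byte-count point concrete, here is a minimal sketch in Python.
It is an illustration only, not CP/M code: the function names are made up,
and the 128-byte record size and the Ctrl-Z (0x1A) marker are the usual
CP/M conventions, assumed here rather than taken from the message above.
It compares a store that keeps only a record count with one that keeps a
byte count.

```python
RECORD_SIZE = 128   # assumed CP/M-style record size, in bytes
CTRL_Z = 0x1A       # assumed in-band end-of-text marker (Ctrl-Z)


def store_block_counted(text: bytes) -> tuple:
    """Pad to whole records with Ctrl-Z; only the record count is kept."""
    records = max(1, -(-len(text) // RECORD_SIZE))  # ceiling division
    padded = text.ljust(records * RECORD_SIZE, bytes([CTRL_Z]))
    return padded, records


def read_block_counted(data: bytes, records: int) -> bytes:
    """The true length is unknown, so stop at the first Ctrl-Z, if any."""
    raw = data[: records * RECORD_SIZE]
    end = raw.find(bytes([CTRL_Z]))
    return raw if end == -1 else raw[:end]


def store_byte_counted(data: bytes) -> tuple:
    """Keep the exact length out of band; no marker byte is needed."""
    return data, len(data)


def read_byte_counted(data: bytes, length: int) -> bytes:
    """Every byte value is legal content, including 0x1A."""
    return data[:length]


if __name__ == "__main__":
    payload = b"abc\x1adef"  # content that happens to contain 0x1A
    print(read_block_counted(*store_block_counted(payload)))
    print(read_byte_counted(*store_byte_counted(payload)))
```

With the record-counted store, any content that contains the marker byte
is cut short on reading, which is why binary files and text files need
different treatment there.  With a byte count the distinction goes away.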
> As for the two-character EOL, it _does_ more accurately represent
> what's happening when you dump a text file to a line printer or a
> terminal.  At the end of each line, you want to go down a line, for
> which you use a newline character, then you want to move the print
> head back to the left-hand side, for which you use a carriage return
> (return the carriage to the left).

So files contain these two characters because someone was too lazy to
add a couple of lines of code to drive the printer or terminal properly.
What if the file system developer's terminal had been a stroke vector
device?  We might have ended up with characters being stored as strokes,
by this reasoning.

> When you think of it like this,

Thinking like this is failing to think abstractly.  It's amateurish,
and it's bad design.

> it's UNIX (newline only) and Macs (carriage return only) that have a
> bogus text file format, not CP/M, MS-DOS, and Win32.

Since all that is *needed* is a single line separator character, a
single line separator character is not bogus.  Since NL is the ANSI
"newline" character, it makes pretty good sense, although RS (record
separator) might have made more sense.  "carriage return" is harder to
justify.

> The true evil is the different standards, not that any particular
> standard is really that bad.

The CRNL eol creates problems because a lone CR or NL is not defined,
and because it makes character-at-a-time processing unnecessarily
difficult.

> Yes, it sucks that we have to deal with
> it now.  But, as my boss says, it's easy to criticize, and hard to
> create.

I've been designing and creating systems software for over 30 years.
Critical analysis makes good design.

--
-
For help on using this list, send a message to
"gnu-win32-request AT cygnus DOT com" with one line of text: "help".
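As an illustration of the character-at-a-time point made in the message,
here is a small Python sketch.  The function names are hypothetical, and
the handling of a lone CR or NL (kept as ordinary data) is an arbitrary
assumption, since the CRNL convention itself does not say what they mean.

```python
def split_single(chars, sep="\n"):
    """One-character separator: each character is judged on its own."""
    lines, current = [], []
    for ch in chars:
        if ch == sep:
            lines.append("".join(current))
            current = []
        else:
            current.append(ch)
    if current:
        lines.append("".join(current))
    return lines


def split_crnl(chars):
    """Two-character terminator: a CR cannot be judged until the next
    character arrives, so one character of state must be carried."""
    lines, current = [], []
    pending_cr = False
    for ch in chars:
        if pending_cr:
            pending_cr = False
            if ch == "\n":
                lines.append("".join(current))
                current = []
                continue
            current.append("\r")  # lone CR: kept as data (assumed policy)
        if ch == "\r":
            pending_cr = True
        else:
            current.append(ch)  # a lone NL is also kept as data here
    if pending_cr:
        current.append("\r")  # input ended on an unresolved CR
    if current:
        lines.append("".join(current))
    return lines


if __name__ == "__main__":
    print(split_single("a\nb\n"))
    print(split_crnl("a\r\nb\r\n"))
    print(split_crnl("a\rb\r\nc\nd\r\n"))
```

The single-separator loop needs no memory between characters.  The CRNL
loop needs a flag, an end-of-input check, and a policy decision for each
undefined case, and a different program may well choose a different
policy.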