Mail Archives: djgpp/2001/08/04/03:39:14

Date: Sat, 04 Aug 2001 10:37:37 +0300
From: "Eli Zaretskii" <eliz AT is DOT elta DOT co DOT il>
Sender: halo1 AT zahav DOT net DOT il
To: djgpp AT delorie DOT com
Message-Id: <7458-Sat04Aug2001103737+0300-eliz@is.elta.co.il>
X-Mailer: Emacs 20.6 (via feedmail 8.3.emacs20_6 I) and Blat ver 1.8.9
In-reply-to: <9kf4fh$tn8@moe.cc.utexas.edu> (churchh@crossmyt.com)
Subject: Re: ANNOUNCE: DJGPP port of GNU Sed 3.02.80 uploaded
References: <200107241624 DOT MAA22074 AT delorie DOT com> <9kf4fh$tn8 AT moe DOT cc DOT utexas DOT edu>
Reply-To: djgpp AT delorie DOT com
Errors-To: nobody AT delorie DOT com
X-Mailing-List: djgpp AT delorie DOT com
X-Unsubscribes-To: listserv AT delorie DOT com

> From: "Henry Churchyard" <churchh AT crossmyt DOT com>
> Newsgroups: comp.os.msdos.djgpp
> Date: 3 Aug 2001 16:18:41 -0500
> >
> > DJGPP specific changes.
> > =======================
> > - Eli Zaretskii contributed a patch to open the input stream in
> >   binary mode on platforms, like DOS/WIN95, that distinguish between
> >   text and binary files.  This will allow to process files that
> >   contain embedded ^Z and lone ^M characters.  This patch has already
> >   been submitted by him to the sed maintainer, so this feature may
> >   become a standard feature in the next official sed release.  Thanks
> >   to Eli Zaretskii for contributing this.
> 
> That's completely the wrong way around; DOS doesn't make any
> distinction whatsoever between binary and text files at the
> file-system level.  What happened was that way back when, Unix adopted
> a somewhat non-standard and idiosyncratic definition of "text"
> (i.e. delimited by LF only), while MS-DOS fully followed the relevant
> standards and adopted a standard definition of text (delimited by
> CR-LF).  (If you don't believe me, look at all the RFC's governing
> Internet protocols -- if they're text-based, such as SMTP, then they
> specify CR-LF line endings.)  So this means that when a C compiler is
> moved over to MS-DOS, it has to have two clunky file-handling modes,
> one of which translates from standard MS-DOS text format to the
> compiler's internal non-standard C/Unix text format

That's true, except that the special meaning of ^Z (as an end-of-file
marker in text files) doesn't come from the Unix text notion, but
rather from CP/M.

In any case, the practical implications of what Juan wrote are still
valid: the previous ports of Sed couldn't handle lone ^M and embedded
^Z characters, while this one can.
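
To make the difference concrete, here is a rough sketch of what a
text-mode read does to the bytes of a file, as opposed to a binary-mode
read, which hands them over untouched.  The function name
text_mode_view is made up for this illustration, and the rules it
applies (drop the CR of a CR-LF pair, stop at the first ^Z) are a
simplified model of the convention discussed above, not the actual
library code; real runtimes differ in details such as how a lone CR is
treated.

```c
#include <stddef.h>

/* Hypothetical helper, for illustration only: a simplified model of
   what a DOS-style text-mode read delivers to the program.
   Copies SRC to DST, dropping the CR of every CR-LF pair and stopping
   at the first ^Z (0x1A), which text mode treats as end of file.
   DST must have room for LEN bytes.  Returns the number of bytes
   written to DST.  */
size_t
text_mode_view (const unsigned char *src, size_t len, unsigned char *dst)
{
  size_t i, out = 0;

  for (i = 0; i < len; i++)
    {
      if (src[i] == 0x1A)
        break;                  /* ^Z: everything after it is lost */
      if (src[i] == '\r' && i + 1 < len && src[i + 1] == '\n')
        continue;               /* CR-LF becomes a bare LF */
      dst[out++] = src[i];      /* anything else passes through */
    }
  return out;
}
```

Feeding this the bytes "a\r\nb\rc^Zd\n" yields only "a\nb\rc": the
data after the ^Z never reaches the program.  A stream opened with
fopen (name, "rb") sees all of the original bytes instead.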
