Mail Archives: djgpp/2001/08/04/03:39:14
> From: "Henry Churchyard" <churchh AT crossmyt DOT com>
> Newsgroups: comp.os.msdos.djgpp
> Date: 3 Aug 2001 16:18:41 -0500
> >
> > DJGPP specific changes.
> > =======================
> > - Eli Zaretskii contributed a patch to open the input stream in
> > binary mode on platforms, like DOS/WIN95, that distinguish between
> > text and binary files. This will allow to process files that
> > contain embedded ^Z and lone ^M characters. This patch has already
> > been submitted by him to the sed maintainer, so this feature may
> > become a standard feature in the next official sed release. Thanks
> > to Eli Zaretskii for contributing this.
>
> That's completely the wrong way around; DOS doesn't make any
> distinction whatsoever between binary and text files at the
> file-system level. What happened was that way back when, Unix adopted
> a somewhat non-standard and idiosyncratic definition of "text"
> (i.e. delimited by LF only), while MS-DOS fully followed the relevant
> standards and adopted a standard definition of text (delimited by
> CR-LF). (If you don't believe me, look at all the RFC's governing
> Internet protocols -- if they're text-based, such as SMTP, then they
> specify CR-LF line endings.) So this means that when a C compiler is
> moved over to MS-DOS, it has to have two clunky file-handling modes,
> one of which translates from standard MS-DOS text format to the
> compiler's internal non-standard C/Unix text format

That's true, except that the special meaning of ^Z doesn't come from
the Unix text notion, but rather from CP/M.

In any case, the practical implications of what Juan wrote are still
valid: the previous ports of Sed couldn't handle lone ^M and embedded
^Z characters, while this one can.
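
For illustration only, here is a minimal sketch in standard C of what
reading in binary mode buys you. It is not the actual Sed patch; the
helper names write_raw and read_raw are made up for this example. With
the "b" flag, the C library is asked not to translate line endings or to
treat ^Z (0x1A) as end of file, so every byte comes back as stored. On
Unix-like systems the flag makes no difference; it matters on DOS-like
platforms.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write len raw bytes to path.
   Returns 0 on success, -1 on failure. */
int write_raw(const char *path, const unsigned char *data, size_t len)
{
    FILE *fp = fopen(path, "wb");   /* "b": no newline translation */
    size_t written;

    if (fp == NULL)
        return -1;
    written = fwrite(data, 1, len, fp);
    if (fclose(fp) != 0)
        return -1;
    return written == len ? 0 : -1;
}

/* Hypothetical helper: read up to cap bytes from path in binary mode.
   Returns the number of bytes read, or -1 on failure. */
long read_raw(const char *path, unsigned char *buf, size_t cap)
{
    FILE *fp = fopen(path, "rb");   /* "b": ^Z and lone ^M pass through */
    size_t got;
    int failed;

    if (fp == NULL)
        return -1;
    got = fread(buf, 1, cap, fp);
    failed = ferror(fp);
    fclose(fp);
    return failed ? -1 : (long)got;
}
```

Writing the seven bytes 'a', ^M, 'b', ^Z, 'c', ^M, LF with write_raw and
reading them back with read_raw should return all seven unchanged,
whereas a text-mode read on DOS would typically stop at the ^Z.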