Date: Sat, 04 Aug 2001 10:37:37 +0300
From: "Eli Zaretskii"
Sender: halo1 AT zahav DOT net DOT il
To: djgpp AT delorie DOT com
Message-Id: <7458-Sat04Aug2001103737+0300-eliz@is.elta.co.il>
X-Mailer: Emacs 20.6 (via feedmail 8.3.emacs20_6 I) and Blat ver 1.8.9
In-reply-to: <9kf4fh$tn8@moe.cc.utexas.edu> (churchh@crossmyt.com)
Subject: Re: ANNOUNCE: DJGPP port of GNU Sed 3.02.80 uploaded
References: <200107241624 DOT MAA22074 AT delorie DOT com> <9kf4fh$tn8 AT moe DOT cc DOT utexas DOT edu>
Reply-To: djgpp AT delorie DOT com
Errors-To: nobody AT delorie DOT com
X-Mailing-List: djgpp AT delorie DOT com
X-Unsubscribes-To: listserv AT delorie DOT com
Precedence: bulk

> From: "Henry Churchyard"
> Newsgroups: comp.os.msdos.djgpp
> Date: 3 Aug 2001 16:18:41 -0500
>
> > DJGPP specific changes.
> > =======================
> > - Eli Zaretskii contributed a patch to open the input stream in
> > binary mode on platforms, like DOS/WIN95, that distinguish between
> > text and binary files. This will allow to process files that
> > contain embedded ^Z and lone ^M characters. This patch has already
> > been submitted by him to the sed maintainer, so this feature may
> > become a standard feature in the next official sed release. Thanks
> > to Eli Zaretskii for contributing this.
>
> That's completely the wrong way around; DOS doesn't make any
> distinction whatsoever between binary and text files at the
> file-system level. What happened was that way back when, Unix adopted
> a somewhat non-standard and idiosyncratic definition of "text"
> (i.e. delimited by LF only), while MS-DOS fully followed the relevant
> standards and adopted a standard definition of text (delimited by
> CR-LF). (If you don't believe me, look at all the RFC's governing
> Internet protocols -- if they're text-based, such as SMTP, then they
> specify CR-LF line endings.) So this means that when a C compiler is
> moved over to MS-DOS, it has to have two clunky file-handling modes,
> one of which translates from standard MS-DOS text format to the
> compiler's internal non-standard C/Unix text format

That's true, except that the special meaning of ^Z doesn't come from
the Unix text notion, but rather from CP/M.

In any case, the practical implications of what Juan wrote are still
valid: the previous ports of Sed couldn't handle lone ^M and embedded
^Z characters, while this one can.
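To illustrate the practical difference, here is a minimal Python sketch.
It is not code from Sed or from any C library. The function names are
hypothetical, and the text-mode model is a simplifying assumption: it
stops at the first ^Z and drops every CR byte. Real runtimes differ in
the details (some strip a CR only when an LF follows it), so treat this
as a rough picture of why a text-mode read can lose data that a
binary-mode read keeps.

```python
CTRL_Z = b"\x1a"  # ^Z, the end-of-file marker inherited from CP/M


def simulated_text_mode_read(raw: bytes) -> bytes:
    """Hypothetical, simplified model of a DOS text-mode read."""
    # Assumption: everything from the first ^Z onward is treated as past EOF.
    visible = raw.split(CTRL_Z, 1)[0]
    # Assumption: every CR is dropped, so CR-LF becomes LF and a lone CR vanishes.
    return visible.replace(b"\r", b"")


def binary_mode_read(raw: bytes) -> bytes:
    """Binary mode hands the bytes to the program untouched."""
    return raw


# Sample input with a CR-LF line ending, a lone ^M, and an embedded ^Z.
sample = b"one\r\ntwo\rstill two\r\nthree\x1afour\r\n"

text_view = simulated_text_mode_read(sample)
binary_view = binary_mode_read(sample)

print("text mode  :", text_view)
print("binary mode:", binary_view)
```

Under this model the text-mode view never shows the lone ^M or anything
after the ^Z, so a program reading that way has no chance to match or
edit them. The binary-mode view keeps every byte.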