Date: Thu, 29 Jul 1999 12:53:44 -0400
Message-Id: <199907291653.MAA20852@envy.delorie.com>
From: DJ Delorie
To: djgpp AT delorie DOT com
In-reply-to: <199907291640.LAA07707@darwin.sfbr.org> (message from Jeff Williams on Thu, 29 Jul 1999 11:40:55 -0500 (CDT))
Subject: Re: about dtou and utod
References: <199907291640 DOT LAA07707 AT darwin DOT sfbr DOT org>
Reply-To: djgpp AT delorie DOT com
X-Mailing-List: djgpp AT delorie DOT com
X-Unsubscribes-To: listserv AT delorie DOT com
Precedence: bulk

> [1] Take a DOS CR/LF/^Z text file; apply `dtou' and get a unix NL
> version (sans ^Z); apply `utod' to this and recover the original DOS
> version *except* for the ^Z EOF marker.  This has *not* been a problem;
> but I noticed it while working on a text filter.  Could someone explain
> why this happens (i.e., why the ^Z doesn't reappear)?

The ^Z is an optional part of a text file.  Thus, it must be stripped
when converting to unix format, but need not be added when converting
to dos format.  Most dos text files don't have a ^Z at the end.

> [3] Related to [2]: is there a way to detect whether a text file is
> in unix NL format or DOS CR/LF/[^Z] format, preferably from within
> a bash script?

Try "grep ^M file && echo dos text file", where ^M is a literal
carriage-return character (type Ctrl-V Ctrl-M to enter one in bash).

> [4] The main reason any of this matters to me is because I move lots of
> files back and forth from work (Solaris 2.7) to home (djgpp).  I was

I use a C program or perl script to detect non-ascii characters (like
NUL or 0x80-0x9f) to select text vs binary files (html vs gif, for
example).
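
Something along these lines would do; this is only a sketch, not the
program I actually use (the name, the treatment of ^Z, and the exact
byte ranges counted as "binary" are arbitrary), but it reads the file
in binary mode and reports whether it looks like binary, dos text, or
unix text:

/* txtype.c - report "binary", "dos text", or "unix text" for a file.
   Sketch only; adjust the set of "binary" bytes to taste.  */
#include <stdio.h>

int main(int argc, char **argv)
{
  FILE *f;
  int c, saw_cr = 0, binary = 0;

  if (argc != 2)
    {
      fprintf(stderr, "usage: %s file\n", argv[0]);
      return 2;
    }
  f = fopen(argv[1], "rb");     /* binary mode, so CRs aren't stripped */
  if (f == NULL)
    {
      perror(argv[1]);
      return 2;
    }
  while ((c = getc(f)) != EOF)
    {
      if (c == '\r')
        saw_cr = 1;             /* CR suggests dos CR/LF line endings */
      else if (c == 26)
        ;                       /* ^Z is legal at the end of dos text */
      else if (c == 0 || (c >= 0x80 && c <= 0x9f))
        binary = 1;             /* NUL or 0x80-0x9f: probably not text */
    }
  fclose(f);

  if (binary)
    printf("binary\n");
  else if (saw_cr)
    printf("dos text\n");
  else
    printf("unix text\n");
  return 0;
}

A bash script can then run it on each file and decide whether to call
`dtou' or `utod' (or leave the file alone) based on what it prints.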