Date: Fri, 30 Aug 2002 10:19:38 +1000 (EST)
From: luke DOT kendall AT cisra DOT canon DOT com DOT au
Subject: Re: Bug? Mixed CR/LF and LF line endings from different programs
To: cygwin AT cygwin DOT com
In-Reply-To: <20020829203820.GB23580@redhat.com>
Message-Id: <20020830001556.9D66C8B1F@bellmann.research.canon.com.au>

On 29 Aug, Christopher Faylor wrote:
>  On Thu, Aug 29, 2002 at 09:21:10AM -0700, Shankar Unni wrote:
>  >Christopher Faylor wrote:
>  >>awk and sed open their standard input in textmode.  This is by design.
>  >
>  >Don't they open their stdout in textmode, then?  Otherwise they should
>  >have been "fixed up back" to \r\n when they wrote the lines out, no?
>  
>  I think you can draw your own conclusions on what is happening pretty
>  easily.

Yep.  I don't understand why, though.  Writing lines with the ending
that matches the mount type, DOS or Unix, seems correct to me.  Yet awk
and sed don't do it.  This suggests that they're not opening the output
file in text mode, which seems to contradict the Cygwin FAQ:

    It is rather easy for the porter to fix the source code by supplying
    the appropriate file processing mode switches to the open/fopen
    functions. Treat all text files as text and treat all binary files
    as binary. 

Since awk and sed work on streams, I doubt that they'd be doing lseeks,
so in fact I can't imagine why they're not opening *input* files in text
mode, too.  They *are* text processing tools, after all.

>  I just wanted to make sure that people understand that the behavior is
>  not a random event.  It comes up from time to time here and I thought
>  that it bears repeating that both are working the way they are designed
>  to work.

I understand that it's not random, but now I'm at a loss to understand
why that behaviour was chosen.

So, what's the recommended way to use Cygwin for any sort of text
processing?  With the current design, files you create with the
standard Unix tools will have a mixture of different kinds of line
endings.

If you work in Unix mode this wouldn't happen, but then *none* of the
files you produce will be acceptable to many native applications, since
they won't have the native line ending.

If I write a program that filters input to output converting all line
endings to the native style, that would "solve" this problem, but it
means that every script will have to be altered, replacing almost every
occurrence of ">" by "| sanitise >", which really is out of the
question.
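
For what it's worth, such a filter could be as small as the sketch
below.  The name "sanitise" is the hypothetical one from the paragraph
above, and the choice to leave a lone CR untouched is my assumption, not
anything Cygwin specifies.

```python
import io
import os


def sanitise(src: io.BufferedIOBase, dst: io.BufferedIOBase,
             native: bytes = os.linesep.encode("ascii")) -> None:
    """Copy src to dst, rewriting every LF or CR/LF line ending as `native`."""
    # Iterating a binary stream yields chunks that end at each b"\n".
    for line in src:
        if line.endswith(b"\r\n"):
            line = line[:-2] + native
        elif line.endswith(b"\n"):
            line = line[:-1] + native
        # A final line with no terminator is passed through unchanged.
        dst.write(line)
```

Used as a filter it would be called as
sanitise(sys.stdin.buffer, sys.stdout.buffer) after an "import sys", so
that Python itself does no newline translation on either side.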

Ah, hang on, I've just been poking through the user guide and found the
CYGWIN=nobinmode option, which makes everything work as I would have
expected.

Whew!

luke

