From: Joachim Achtzehnter
Date: Thu, 03 Aug 2006 11:43:00 -0700
To: cygwin AT cygwin DOT com
Subject: line endings, file path names (was: Updated: sed-4.1.5-2)

Corinna Vinschen wrote:
[JA wrote:]
>> Thank you very much for this fix. It will make life easier for all of
>> us who struggle with a mix of native and Cygwin tools. It is very much
>> appreciated that as far as line endings are concerned the attitude
>> taken by Cygwin developers is not "use POSIX line endings".
>
> Sorry, but that's not why I did it. My personal opinion is still
> strongly on the "use POSIX line endings" side.

Too bad.

> I made the fix only so that other mailing lists don't suffer

This is a strange reason for changing sed's functional behaviour, but
since I like the outcome I won't complain. :-)

> CRLF lineendings are in the top 10 of the worst ideas in the OS
> business.

I agree 100%, and I also agree that DOS path names were a horrendous
idea, but neither of these questions is at issue here.

> and I'm seriously contemplating (for years) to just remove textmode
> from Cygwin.

This is where I disagree completely. "CRLF was a bad idea" does not
imply "hence we should not support it". That would just be sticking
your head in the sand.
Bad idea or not, you, or rather a text processing tool like sed, cannot
avoid being faced with millions of documents that use CRLF, and a few
with Mac line endings too. The realization that it was a bad idea does
not make these go away.

The only realistic approach here, and more so with line endings than
with the path name issue, is the one taken by XML (about which I usually
have no good word to say):

   2.11 End-of-Line Handling

   XML parsed entities are often stored in computer files which, for
   editing convenience, are organized into lines. These lines are
   typically separated by some combination of the characters
   carriage-return (#xD) and line-feed (#xA).

   To simplify the tasks of applications, the characters passed to an
   application by the XML processor must be as if the XML processor
   normalized all line breaks in external parsed entities (including
   the document entity) on input, before parsing, by translating both
   the two-character sequence #xD #xA and any #xD that is not followed
   by #xA to a single #xA character.

With respect to "text mode", don't forget that this is also part of the
ISO standards for C and C++, although those standards don't go as far as
XML does.

Another way to look at the issue: you can always blame the whole mess on
those who started the CRLF thing, and I'm all on your side, but users of
your tools will have to muddle through this mess one way or another. You
can make it easier for your users by making the tools tolerate inputs
that are affected by the mess that exists in real life, or you can make
it difficult. If you take the latter route, people will gravitate toward
other tools in the long run. Cygwin has become as popular as it is
because it helps get the job done, where the job is dealing with a mixed
environment (POSIX-like behaviour in a non-POSIX environment).
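
To make the XML rule quoted above concrete, here is a minimal sketch in
Python of the normalization it describes. The function name
normalize_newlines is made up for this illustration; it is not how sed,
Cygwin, or any particular XML processor implements the rule, only one
way the two translations could be expressed.

```python
def normalize_newlines(text: str) -> str:
    """Translate CRLF pairs and lone CRs to a single LF.

    Hypothetical helper for illustration only; it mirrors the two
    translations listed in XML 1.0 section 2.11.
    """
    # Collapse the two-character sequence first. If lone CRs were
    # replaced first, every CRLF would turn into two line feeds.
    text = text.replace("\r\n", "\n")
    # Any CR still present was not followed by LF (old Mac style).
    return text.replace("\r", "\n")


if __name__ == "__main__":
    mixed = "one\r\ntwo\rthree\nfour\r\n"
    print(repr(normalize_newlines(mixed)))
```

The order of the two replacements is the only subtle point. A tool that
applied this on input could treat DOS, Mac, and POSIX files alike from
then on.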

Joachim

-- 
work:     joachima AT netacquire DOT com  (http://www.netacquire.com)
private:  joachim AT kraut DOT ca         (http://www.kraut.ca)