From: Vincent Rivière
Date: Wed, 07 Oct 2009 18:24:28 +0200
To: cygwin AT cygwin DOT com
Subject: Re: Additional carriage return added by cygwin commands to DOS text files

ttjqryfbndgdx wrote:
> Note that I don't have the issue with cat.
> bash-3.2$ cat test1 > test2
> bash-3.2$ xxd test2
> 0000000: 6161 610d 0a62 6262 0d0a  aaa..bbb..

"cat" treats both its input and its output as binary, so "cat a > b" is always equivalent to "cp a b".

Now, if you think that cat should treat the files as text, telling Cygwin to remove the CRs on input and add them back on output, then there is an error on input (the CRs are not removed) and an error on output (they are not added). The two errors cancel each other out, so the result is still good.

> I don't have it with sort used alone :
> bash-3.2$ /usr/bin/sort test1 > test2
> bash-3.2$ xxd test2
> 0000000: 6161 610d 0a62 6262 0d0a  aaa..bbb..

"sort" opens both its input and its output as text. It is what I call a "good text filter", like "more".

> But get it when using sort in a pipe with cat :
> bash-3.2$ cat test1 | /usr/bin/sort > test2
> bash-3.2$ xxd test2
> 0000000: 6161 610d 0d0a 6262 620d 0d0a  aaa...bbb...

"cat" opens test1 in binary: error on input.
The unexpected CRs go into cat's memory, then into the pipe, then into sort's memory, then into the output file, where additional CRs are inserted because sort uses text-mode output.

> But using more instead of cat solves the issue :
> bash-3.2$ more test1 | /usr/bin/sort > test2
> bash-3.2$ xxd test2
> 0000000: 6161 610d 0a62 6262 0d0a  aaa..bbb..

Same as sort alone. test1 is opened in text mode by more, so the CRs are automatically stripped. The correct data, free of CRs, goes through more's memory, the pipe, then sort's memory. Then test2 is opened for output in text mode and the CRs automagically reappear.

The key thing to understand is that when text files are opened in text mode (as they always should be), programs never see the CRs in memory. Cygwin automatically strips them when reading from real files and adds them back when writing to real files. Note that pipes (unlike real files) are always binary: nothing is stripped or added there, so they normally carry data without CRs.

No mystery (but hard to understand at first).

-- 
Vincent Rivière
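
The four cases discussed in this message can be reproduced with a toy model. This is only a sketch in Python, not Cygwin's actual implementation: the helper names read_file and write_file are made up for illustration, and it assumes that text mode simply drops the CR of each CR LF pair on input and puts a CR before each LF on output. The sort step itself is left out because the two lines are already in sorted order.

```python
# Toy model of text-mode vs binary-mode file access (illustrative only,
# not Cygwin's real code). Pipes are modelled as passing bytes unchanged.

def read_file(data: bytes, text_mode: bool) -> bytes:
    """Return the bytes a program sees in memory after reading a real file."""
    # Assumption: text mode strips the CR of every CR LF pair.
    return data.replace(b"\r\n", b"\n") if text_mode else data


def write_file(memory: bytes, text_mode: bool) -> bytes:
    """Return the bytes that land on disk when a program writes a real file."""
    # Assumption: text mode inserts a CR before every LF.
    return memory.replace(b"\n", b"\r\n") if text_mode else memory


test1 = b"aaa\r\nbbb\r\n"  # the DOS text file from the thread

# cat test1 > test2 : binary input, binary output
cat_only = write_file(read_file(test1, text_mode=False), text_mode=False)

# sort test1 > test2 : text input, text output
sort_only = write_file(read_file(test1, text_mode=True), text_mode=True)

# cat test1 | sort > test2 : cat reads in binary, the pipe changes nothing,
# sort writes in text mode, so each LF gets a second CR in front of it
cat_sort = write_file(read_file(test1, text_mode=False), text_mode=True)

# more test1 | sort > test2 : more reads in text mode, sort writes in text mode
more_sort = write_file(read_file(test1, text_mode=True), text_mode=True)

for name, result in [
    ("cat", cat_only),
    ("sort", sort_only),
    ("cat | sort", cat_sort),
    ("more | sort", more_sort),
]:
    print(f"{name:12} {result.hex()}")
```

Only the "cat | sort" case ends up with the doubled CR (0d 0d 0a), matching the xxd output quoted in this message; the other three give back the original bytes.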