Date: Mon, 13 Feb 2012 15:16:40 -0500
Subject: Re: sed strips CRs
From: Earnie Boyd
To: cygwin AT cygwin DOT com
Cc: John Cowan, bug-sed AT gnu DOT org

On Mon, Feb 13, 2012 at 2:48 PM, Paolo Bonzini wrote:
>
> If you meant that "rt" should be restricted to cygwin, that's also fine by
> me but in general I prefer feature tests to OS tests.
>

Then it becomes Cygwin's problem. I'm going to quote from
http://msdn.microsoft.com/en-us/library/yeby3zcb.aspx

  t
    Open in text (translated) mode. In this mode, CTRL+Z is interpreted
    as an EOF character on input. In files that are opened for
    reading/writing by using "a+", fopen checks for a CTRL+Z at the end
    of the file and removes it, if possible. This is done because using
    fseek and ftell to move within a file that ends with CTRL+Z may
    cause fseek to behave incorrectly near the end of the file.

    In text mode, carriage return–linefeed combinations are translated
    into single linefeeds on input, and linefeed characters are
    translated to carriage return–linefeed combinations on output. When
    a Unicode stream-I/O function operates in text mode (the default),
    the source or destination stream is assumed to be a sequence of
    multibyte characters. Therefore, the Unicode stream-input functions
    convert multibyte characters to wide characters (as if by a call to
    the mbtowc function). For the same reason, the Unicode stream-output
    functions convert wide characters to multibyte characters (as if by
    a call to the wctomb function).

So does Cygwin really want to specify "rt"? I would rather sed specify
"rb" and treat the CR as white space. I know that treating CR as white
space works well.

--
Earnie
-- https://sites.google.com/site/earnieboyd
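
To make the "rb" suggestion concrete, here is a minimal sketch in C. It is not sed's code. The names open_input_binary and read_line_binary are hypothetical, and stripping a CR that sits directly before the LF is only one possible reading of "treat the CR as white space".

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: open for reading in binary mode, so the C
 * runtime does no CR-LF translation and no CTRL+Z handling. */
FILE *open_input_binary(const char *path)
{
    return fopen(path, "rb");
}

/* Hypothetical helper: read one line with fgets, then drop a trailing
 * LF and, after that, a trailing CR. Returns the length of the
 * remaining text, or -1 at end of file or on a read error.
 * Assumption: lines fit in the caller's buffer. */
long read_line_binary(FILE *fp, char *buf, size_t size)
{
    assert(fp != NULL && buf != NULL && size > 1);

    if (fgets(buf, (int)size, fp) == NULL)
        return -1;

    size_t len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n')
        buf[--len] = '\0';      /* drop the LF */
    if (len > 0 && buf[len - 1] == '\r')
        buf[--len] = '\0';      /* drop the CR left behind by binary mode */
    return (long)len;
}
```

With this approach the caller sees the same line contents whether the file uses LF or CR-LF endings, and the decision stays in the program rather than in the fopen mode string.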