From: "Michael Moser"
Subject: sed converts 8-bit input text to 16-bit (Unicode-16?) characters - how to suppress that?
Date: Mon, 30 Mar 2009 13:48:03 +0200
Delivered-To: mailing list cygwin AT cygwin DOT com

I need to mangle a file containing "8-bit ASCII" characters (i.e. the file also contains characters in the upper 8-bit range, namely a few umlauts as well as some French accented characters).

Strangely enough, the sed version that came as part of Cygwin emits the result of the mangling using 16-bit characters. (I believe those are Unicode-16 characters, but I am not sure. The hex editor shows every second byte as 00, except for the first two bytes, which read FF FE.) Alas, this makes the next program in the chain throw up and die.

How can one suppress this conversion? I found no option or flag to tell sed to stay with 8-bit characters.

Just in case: I need this only to strip some trailing blanks, convert tabs to spaces, etc. The conversion doesn't need to do anything with those characters that have the 8th bit set (except that it needs to keep them as is).

Michael
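The cleanup described above can also be sketched at the byte level, which avoids any character-set conversion. This is a minimal Python sketch, not a fix for sed itself. The names `looks_like_utf16_le`, `clean_bytes`, and `tab_width` are hypothetical, and it assumes that each tab simply becomes a fixed number of spaces (no tab-stop alignment) and that "trailing blanks" means trailing spaces and tabs.

```python
UTF16_LE_BOM = b"\xff\xfe"  # the two leading bytes reported in the post


def looks_like_utf16_le(data):
    """Return True if the data starts with the FF FE byte order mark."""
    return data.startswith(UTF16_LE_BOM)


def clean_bytes(data, tab_width=4):
    """Replace tabs with spaces and strip trailing blanks on every line.

    Operates on raw bytes, so bytes with the 8th bit set are copied
    through unchanged and no 16-bit output is ever produced.
    Assumption: each tab becomes exactly `tab_width` spaces.
    """
    cleaned = []
    for line in data.split(b"\n"):
        # Set a trailing CR (DOS line ending) aside so it is preserved.
        cr = b"\r" if line.endswith(b"\r") else b""
        body = line[:len(line) - len(cr)]
        # Tabs become spaces first, so trailing tabs are stripped as well.
        body = body.replace(b"\t", b" " * tab_width).rstrip(b" ")
        cleaned.append(body + cr)
    return b"\n".join(cleaned)
```

Used as a filter, such a function would read from `sys.stdin.buffer` and write to `sys.stdout.buffer`, so the data never passes through a text decoder. The `looks_like_utf16_le` check is only a quick way to confirm the FF FE symptom on the output of an earlier stage.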