Mail Archives: cygwin/2010/05/20/12:05:47

MIME-Version: 1.0
In-Reply-To: <20100520123926.GA1432@onderneming10.xs4all.nl>
References: <20100520123926 DOT GA1432 AT onderneming10 DOT xs4all DOT nl>
Date: Thu, 20 May 2010 19:05:17 +0300
Message-ID: <AANLkTilpbuyiJIswTZGQN5jsHsK793ITUP9pcB95Hf1l@mail.gmail.com>
Subject: Re: sed doesn't like LANG= anymore
From: Andy Koppe <andy DOT koppe AT gmail DOT com>
To: "cygwin AT cygwin DOT com" <cygwin AT cygwin DOT com>

On Thursday, May 20, 2010, Jurriaan wrote:
> A very long sed script that's been working for ages (back from the 1.5
> age) here has stopped working.
>
> It turned out sed doesn't like some strings anymore when environment
> variable LANG is empty. With LANG=ASCII, there are no problems.
>
> The actual text in the SED command is shown below as spaces, but it's a
> Swedish a with a small o on top of it, like this:
>
> sed -e"s/@a/ a/g;"
>
> where a is character 0xe5.
>
> Running with LANG=ASCII works, with LANG empty I get 'unterminated `s'
> command' from sed (which confused me for a while).

With empty LANG you're using the default UTF-8 encoding, where that
0xe5 byte constitutes an incomplete character. You need to either run
with a LANG setting that fits your script, e.g. C.ISO-8859-1, or
convert your script to UTF-8. I'm puzzled as to why LANG=ASCII would
have worked, since that's not a valid setting.

Andy
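
A minimal sketch of the encoding point made above, written in Python only
because it makes the bytes easy to inspect. The script text is the one-line
example from the quoted message; the helper name decodes_as_utf8 is
hypothetical and not part of sed or Cygwin. It shows that a lone 0xe5 byte
followed by "/" does not decode as UTF-8, and that re-encoding the script
from ISO-8859-1 to UTF-8 turns that byte into the two-byte sequence
0xc3 0xa5.

```python
# The sed script from the report, stored as ISO-8859-1 bytes.
# 0xE5 is "a with ring above" in ISO-8859-1; in UTF-8 it is the lead byte
# of a three-byte sequence, so it must be followed by continuation bytes.
latin1_script = b"s/@\xe5/ a/g;"


def decodes_as_utf8(data: bytes) -> bool:
    """Return True if data is a valid UTF-8 byte sequence."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Converting the script: decode with the encoding it was written in,
# then encode again as UTF-8.
utf8_script = latin1_script.decode("iso-8859-1").encode("utf-8")

print(decodes_as_utf8(latin1_script))  # the original bytes are not valid UTF-8
print(decodes_as_utf8(utf8_script))    # the converted bytes are
print(utf8_script)
```

The same conversion can be done on a script file with a tool such as iconv;
the alternative mentioned above is to leave the file alone and run sed with
a LANG value whose character set matches it.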
