Mail Archives: cygwin/2012/07/20/07:53:31

In-Reply-To: <loom.20120720T122137-881@post.gmane.org>
References: <loom DOT 20120719T162931-952 AT post DOT gmane DOT org> <50086B4A DOT 6090801 AT laposte DOT net> <loom DOT 20120720T084207-511 AT post DOT gmane DOT org> <20120720082452 DOT GV31055 AT calimero DOT vinschen DOT de> <loom DOT 20120720T122137-881 AT post DOT gmane DOT org>
Date: Fri, 20 Jul 2012 12:53:05 +0100
Message-ID: <CAHWeT-bUM0eud4ht9RO-tcECmQMLXH2k4EEuVXrr7QqXUG=6Hg@mail.gmail.com>
Subject: Re: Internal echo of shell beaves (sometimes) different to external echo
From: Andy Koppe <andy DOT koppe AT gmail DOT com>
To: cygwin AT cygwin DOT com

On 20 July 2012 11:46, Ralf wrote:
> My problem is not that the script is in ISO-8859-1, nor that the strings
> or ttt.txt are in ISO-8859-1. They have to be in ISO-8859-1 because all my
> scripts are in ISO-8859-1 and they are used together with Windows programs
> (in the DOS box) which read and write only ISO-8859-1.
>
> My problem is handling, in shell scripts, strings which are encoded in
> ISO-8859-1 (and line endings which depend on relative/absolute filenames,
> mounting and so on) without rewriting all the stuff.
>
> So what's the best setting in Cygwin to echo ISO-8859-1? I still don't
> understand why the internal echo behaves differently from the external
> echo.

It's because setting LC_ALL in a bash script is too late for the bash
process itself, which will be using the default C.UTF-8 locale unless
something else is set when bash is invoked.

When stuff is written to a console (but not a pty-based terminal), the
Cygwin DLL converts it from the process charset (UTF-8 in this case)
to UTF-16 to pass it to the relevant Windows API function. Your
ISO-8859-1 encoded 'ü' (byte 0xFC) is an invalid byte when interpreted as UTF-8,
hence the error character.
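
The decode-then-re-encode step can be sketched in Python. This is only an illustration of the principle, not the Cygwin DLL's actual code: the function name to_console_utf16 is hypothetical, and Python's U+FFFD replacement character merely stands in for whatever error character the console shows.

```python
def to_console_utf16(raw: bytes, process_charset: str) -> bytes:
    """Decode raw output using the process charset, then re-encode as UTF-16LE."""
    # errors="replace" substitutes U+FFFD for bytes that are not valid
    # in the given charset (an assumption standing in for the console's
    # error character).
    text = raw.decode(process_charset, errors="replace")
    return text.encode("utf-16-le")


u_umlaut_latin1 = b"\xfc"  # u-umlaut as a single ISO-8859-1 byte

# Process charset UTF-8: 0xFC is not a valid UTF-8 sequence.
as_utf8 = to_console_utf16(u_umlaut_latin1, "utf-8")
# Process charset ISO-8859-1: 0xFC maps to U+00FC as intended.
as_latin1 = to_console_utf16(u_umlaut_latin1, "iso-8859-1")

# ascii() keeps the output printable regardless of the terminal charset.
print(ascii(as_utf8.decode("utf-16-le")))
print(ascii(as_latin1.decode("utf-16-le")))
```

The same byte yields the replacement character under UTF-8 and the intended letter under ISO-8859-1, so the charset the writing process believes it is using decides what reaches the console.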

/usr/bin/echo on the other hand is invoked as a separate process, with
LC_ALL already set appropriately, hence there you're getting the
expected output.
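
The general principle, that a newly started process sees the environment it was launched with, can be sketched in Python. Here Python stands in for both the script and the external program, and the locale name en_US.ISO-8859-1 is only an assumed example value, not a recommendation.

```python
import os
import subprocess
import sys

# Assumed example locale name, used only for illustration.
child_env = dict(os.environ, LC_ALL="en_US.ISO-8859-1")

# Start a separate process with LC_ALL already present in its environment,
# and let it report the value it was started with.
result = subprocess.run(
    [sys.executable, "-c", "import os; print(os.environ.get('LC_ALL'))"],
    env=child_env,
    capture_output=True,
    text=True,
    check=True,
)
child_lc_all = result.stdout.strip()
print(child_lc_all)
```

The child reports the new value because it was present when the child started. The parent here was started earlier, like the already running bash, and any locale setup it did at startup predates the change.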

Andy


