Mail Archives: cygwin/2009/09/02/16:10:55

In-Reply-To: <4A9E5B9A.4010701@byu.net>
References: <416096c60908300959i1e0084b1xc8f6e65e792b035d AT mail DOT gmail DOT com> <20090831005258 DOT GG2068 AT ednor DOT casa DOT cgf DOT cx> <416096c60909012329l2f25e735yc07145b8d6698cda AT mail DOT gmail DOT com> <4A9E5B9A DOT 4010701 AT byu DOT net>
Date: Wed, 2 Sep 2009 21:10:42 +0100
Message-ID: <416096c60909021310v40941791r5fb273ab04b51481@mail.gmail.com>
Subject: Re: The C locale
From: Andy Koppe <andy DOT koppe AT gmail DOT com>
To: cygwin AT cygwin DOT com

Eric Blake:
>> A rather important exception is 'ls', which seems to have its own
>> hardcoded limitation to 7 bits for the C locale: anything non-ASCII is
>> shown as '?' there.
>
> That's only because the current build of cygwin ls pre-dates a lot of the
> locale support. I'm hoping that when I get time to build coreutils 7.5,
> that ls will start printing characters marked printable in the current
> locale.

Don't worry, on 1.7 it already works fine in locales other than "C".
And it turns out that the restriction with the latter is due to newlib
being inconsistent: whereas the conversion functions use ISO-8859-1,
the ctype functions insist on ASCII, i.e. the is*() classification
functions (isprint() and friends) return 0 for anything above 0x7F.
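
To see the mismatch on a given C library, a small check along these lines
might help. It is only a sketch: the helper names
count_printable_high_bytes() and convert_high_byte() are made up for
illustration, and the results depend on the libc in use. If newlib behaves
as described above, the first should return 0 while the second should
report a successful one-byte conversion for bytes above 0x7F.

```c
#include <ctype.h>
#include <locale.h>
#include <stdlib.h>

/* Hypothetical helper: count the byte values in 0x80..0xFF that isprint()
   accepts in the "C" locale. An ASCII-only ctype table gives 0. */
int count_printable_high_bytes(void)
{
    int count = 0;

    setlocale(LC_ALL, "C");
    for (int c = 0x80; c <= 0xFF; c++) {
        if (isprint(c))
            count++;
    }
    return count;
}

/* Hypothetical helper: run one byte through mbtowc() in the "C" locale.
   Returns mbtowc()'s result: 1 if the byte converted, -1 if it was
   rejected (0 for the NUL byte). On success *out holds the wide character. */
int convert_high_byte(unsigned char byte, wchar_t *out)
{
    setlocale(LC_ALL, "C");
    (void)mbtowc(NULL, NULL, 0);   /* reset any conversion state */
    return mbtowc(out, (const char *)&byte, 1);
}
```

Calling convert_high_byte(0xE9, &wc) and comparing the result with
isprint(0xE9) shows whether the conversion side and the ctype side of the
library agree about that byte.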

So in the C locale we've currently got UTF-8 for filenames, ISO-8859-1
for the console and multibyte conversions, and ASCII for the ctype
functions.
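
As a rough illustration of what those three views mean for one filename
character, take 'é', which UTF-8 stores as the two bytes 0xC3 0xA9. The
sketch below uses made-up helper names and only handles the two-byte UTF-8
case; it is not taken from newlib or coreutils.

```c
#include <stddef.h>

/* UTF-8 view: decode one well-formed two-byte sequence to its code point.
   Returns -1 for anything else (hypothetical helper, two-byte case only). */
long decode_utf8_2(const unsigned char *p)
{
    if (p[0] < 0xC2 || p[0] > 0xDF || (p[1] & 0xC0) != 0x80)
        return -1;
    return ((long)(p[0] & 0x1F) << 6) | (long)(p[1] & 0x3F);
}

/* ISO-8859-1 view: every byte is one character whose code point
   equals the byte value. */
long decode_latin1(unsigned char byte)
{
    return (long)byte;
}

/* ASCII view: count the bytes that a 7-bit check would reject. */
size_t count_non_ascii(const unsigned char *buf, size_t len)
{
    size_t rejected = 0;

    for (size_t i = 0; i < len; i++) {
        if (buf[i] > 0x7F)
            rejected++;
    }
    return rejected;
}
```

For 'é' the UTF-8 reading gives the single code point U+00E9, the
ISO-8859-1 reading gives two characters (U+00C3 and U+00A9), and the ASCII
check rejects both bytes.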

Andy

