Mail Archives: cygwin/2015/12/24/14:16:01

Date: Thu, 24 Dec 2015 20:15:42 +0100
From: Corinna Vinschen <corinna-cygwin AT cygwin DOT com>
To: cygwin AT cygwin DOT com
Subject: Re: Default locale for Russian/Russia should be ru_RU.CP1251


On Dec 24 18:40, Andrey ``Bass'' Shcheglov wrote:
> Hi,
>
> I'm running Cygwin 2.2.0 on an English Windows 8.1 box:
>
> > CYGWIN_NT-6.3 UNIT-725 2.2.0(0.289/5/3) 2015-08-03 12:51 x86_64 Cygwin
>
> Windows regional settings are set to Russian/Russia.
>
> In the absence of any settings in bashrc/bash_profile, `locale` command
> outputs the following:
>
> > LANG=ru_RU
> > LC_CTYPE="ru_RU"
> > LC_NUMERIC="ru_RU"
> > LC_TIME="ru_RU"
> > LC_COLLATE="ru_RU"
> > LC_MONETARY="ru_RU"
> > LC_MESSAGES="ru_RU"
> > LC_ALL=
>
> This is perfectly fine, except that "no charset" in the locale output
> means "ISO charset", which is ISO-8859-5 for Russian/Russia and has
> never been used (historically, DOS used CP866, Windows used the CP1251
> ANSI codepage, and various Unices stuck to KOI8-R before the rise of
> the Unicode era).

Well, not quite.  Cygwin is following Linux here:

  linux$ locale -av
  [...]
  locale: ru_RU           archive: /usr/lib/locale/locale-archive
  ----------------------------------------------------------------------
      title | Russian locale for Russia
     source | RAP
    address | Sankt Jorgens Alle 8, DK-1615 Kobenhavn V, Danmark
      email | bug-glibc-locales AT gnu DOT org
   language | Russian
  territory | Russia
   revision | 1.0
       date | 2000-06-29
    codeset | ISO-8859-5

  cygwin$ locale -av
  [...]
  locale: ru_RU           archive: /mnt/c/WINDOWS/system32/KERNEL32.DLL
  ----------------------------------------------------------------------
   language | Russian
  territory | Russia
    codeset | ISO-8859-5

> Cygwin docs state that
>
> > Starting with Cygwin 1.7.2, the default character set is determined by
> > the default Windows ANSI codepage for this language and territory.

You stopped reading too early.  The docs continue:

  Cygwin uses a character set which is the typical Unix-equivalent to
  the Windows ANSI codepage.  For instance: [...]

> which is not true in my case (Windows ANSI codepage for Cyrillic is
> CP1251, not ISO-8859-5!).

Rephrasing the above: Cygwin uses the ANSI codepage only to look up the
corresponding default Linux codeset.  Maybe the documentation is a bit
fuzzy, but it doesn't say the charset is set *to* the Windows ANSI
charset; it just *uses* that information to compute and set the codeset
to the equivalent Linux codeset.

> Surprisingly, for Belarusian (a.k.a
> Belorussian, Eastern Slavic language very close to Russian) "be_BY"
> locale the default charset is indeed CP1251 which is in accordance with
> both the documentation and common sense.

See the docs:

  The default charset of the "be_BY" locale (Belarusian/Belarus) is CP1251.
  With the "@latin" modifier it's UTF-8.

Just as on Linux.
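
To make the lookup concrete, here is a minimal Python sketch of how a
locale name could resolve to a codeset.  It is not Cygwin's
implementation: the table holds only the three defaults mentioned in
this thread, and the names DEFAULT_CODESET, split_locale and
effective_codeset are made up for illustration.

```python
# Illustrative only: defaults taken from the three examples in this thread.
# Cygwin's real table lives in its C sources and covers many more locales.
DEFAULT_CODESET = {
    ("ru_RU", ""): "ISO-8859-5",
    ("be_BY", ""): "CP1251",
    ("be_BY", "latin"): "UTF-8",
}


def split_locale(name):
    """Split 'lang_TERR[.codeset][@modifier]' into (base, codeset, modifier)."""
    base, _, modifier = name.partition("@")
    base, _, codeset = base.partition(".")
    return base, codeset, modifier


def effective_codeset(name):
    """Return the explicit codeset if one is given, else the table default."""
    base, codeset, modifier = split_locale(name)
    if codeset:
        return codeset
    # None means "not in this toy table", not "no default exists".
    return DEFAULT_CODESET.get((base, modifier))


if __name__ == "__main__":
    for name in ("ru_RU", "be_BY", "be_BY@latin", "ru_RU.utf8"):
        print(name, "->", effective_codeset(name))
```

An explicit codeset in the locale name, as in ru_RU.utf8, always wins
over the default, which is why setting LANG explicitly sidesteps the
whole question.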

> Despite that, $(locale -u) returns "en_GB", even though all regional
> settings are set to Russian/Russia. I believe this is not correct,
> either, and needs to be fixed.

The locale is directly taken from the Windows system function
GetUserDefaultUILanguage() in case of the -u option(*), and from
GetUserDefaultLCID() in case of the -f option(**).  This value is then
fed into the Windows function GetLocaleInfo()(***) to fetch language and
territory codes and that's what locale -u/-f prints.

So it looks like you're using a UK English system with only the region
settings changed to Russia.

In general UTF-8 is the preferred codeset so setting LANG to ru_RU.utf8
(locale -fU should work for you) is the better choice.
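
As an aside on why the codeset choice matters: the same Cyrillic letter
is a different byte in each of the legacy charsets mentioned in this
thread, so text written under one codeset is misread under another.  A
small Python sketch using only the standard codecs (the helper name
encodings_of is hypothetical):

```python
# Charsets mentioned in this thread, by their Python codec names.
CODECS = ("cp866", "cp1251", "iso8859_5", "koi8_r", "utf_8")


def encodings_of(text):
    """Map each codec name to the hex string of `text` encoded with it."""
    return {codec: text.encode(codec).hex() for codec in CODECS}


if __name__ == "__main__":
    for codec, hexbytes in encodings_of("Я").items():
        print(f"{codec:10} {hexbytes}")

    # Bytes written as CP1251 but read as ISO-8859-5 come out as
    # a different letter: the classic mojibake failure.
    misread = "Я".encode("cp1251").decode("iso8859_5")
    print("CP1251 bytes read as ISO-8859-5:", misread)
```

With UTF-8 on both ends there is only one encoding to agree on, which
is the practical argument for ru_RU.utf8.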


Corinna

(*) https://sourceware.org/git/?p=newlib-cygwin.git;a=blob;f=winsup/utils/locale.cc;h=fadf3f3dacedad6474c92aabe826620b2677e494;hb=HEAD#l805

(**) https://sourceware.org/git/?p=newlib-cygwin.git;a=blob;f=winsup/utils/locale.cc;h=fadf3f3dacedad6474c92aabe826620b2677e494;hb=HEAD#l812

(***) https://sourceware.org/git/?p=newlib-cygwin.git;a=blob;f=winsup/utils/locale.cc;h=fadf3f3dacedad6474c92aabe826620b2677e494;hb=HEAD#l114

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat
