| Subject: | Re: Default locale for Russian/Russia should be ru_RU.CP1251 |
| To: | cygwin AT cygwin DOT com |
| References: | <567C1207 DOT 3020700 AT gmail DOT com> |
| From: | Marco Atzeri <marco DOT atzeri AT gmail DOT com> |
| Message-ID: | <567C37D9.8090102@gmail.com> |
| Date: | Thu, 24 Dec 2015 19:22:17 +0100 |
On 24/12/2015 16:40, Andrey ``Bass'' Shcheglov wrote:
> Hi,
>
> I'm running Cygwin 2.2.0 on an English Windows 8.1 box:
>
>> CYGWIN_NT-6.3 UNIT-725 2.2.0(0.289/5/3) 2015-08-03 12:51 x86_64 Cygwin
>
> Windows regional settings are set to Russian/Russia.
>
> In the absence of any settings in bashrc/bash_profile, the `locale`
> command outputs the following:
>
>> LANG=ru_RU
>> LC_CTYPE="ru_RU"
>> LC_NUMERIC="ru_RU"
>> LC_TIME="ru_RU"
>> LC_COLLATE="ru_RU"
>> LC_MONETARY="ru_RU"
>> LC_MESSAGES="ru_RU"
>> LC_ALL=
>
> This is perfectly fine, except that "no charset" in the locale output
> means "ISO charset", which is ISO-8859-5 for Russian/Russia and has
> never been used in practice (historically, DOS used CP866, Windows used
> the CP1251 ANSI codepage, and various Unices stuck to KOI8-R before the
> rise of the Unicode era).
>
> The above is consistent with locale charmap output, which is again
> ISO-8859-5.
>
>
> A short C example also confirms that ISO-8859-5 is used:
>
>> #include <stdio.h>
>>
>> #include <locale.h>
>> #include <langinfo.h>
>>
>> int main() {
>> const char *locale = setlocale(LC_ALL, "");
>> const char *codeset = nl_langinfo(CODESET);
>> printf("locale: %s\n", locale);
>> printf("codeset: %s\n", codeset);
>>
>> return 0;
>> }
>
> outputs
>
>> locale: ru_RU/ru_RU/ru_RU/ru_RU/ru_RU/C
>> codeset: ISO-8859-5
>
>
> Cygwin docs state that
>
>> Starting with Cygwin 1.7.2, the default character set is determined by the default Windows ANSI codepage for this language and territory.
>
> which is not true in my case (the Windows ANSI codepage for Cyrillic is
> CP1251, not ISO-8859-5!). Surprisingly, for the Belarusian (a.k.a.
> Belorussian, an Eastern Slavic language very close to Russian) "be_BY"
> locale, the default charset is indeed CP1251, which is in accordance
> with both the documentation and common sense.
>
>
> Additionally, in `strace locale -u` output, I see multiple
>> __get_lcid_from_locale: LCID=0x0419
> lines.
>
> "0x0419" corresponds to Russian/Russia (see
> <https://msdn.microsoft.com/en-us/library/windows/desktop/dd318693%28v=vs.85%29.aspx?f=255&MSPPError=-2147217396>).
>
> Despite that, $(locale -u) returns "en_GB", even though all regional
> settings are set to Russian/Russia. I believe this is not correct
> either, and needs to be fixed.
The current code in
winsup/cygwin/nlsfuncs.cc
is responsible for the ISO-8859-5 default:
--------------------------------------------------------------
    case 1251:
      if (lcid == 0x0c1a    /* sr_CS (Serbian Language/Former Serbia and Montenegro) */
          || lcid == 0x1c1a /* sr_BA (Serbian Language/Bosnia and Herzegovina) */
          || lcid == 0x281a /* sr_RS (Serbian Language/Serbia) */
          || lcid == 0x301a /* sr_ME (Serbian Language/Montenegro) */
          || lcid == 0x0440 /* ky_KG (Kyrgyz/Kyrgyzstan) */
          || lcid == 0x0843 /* uz_UZ (Uzbek/Uzbekistan) */
          /* tt_RU (Tatar/Russia), IQTElif alphabet */
          || (lcid == 0x0444 && has_modifier ("@iqtelif"))
          || lcid == 0x0450) /* mn_MN (Mongolian/Mongolia) */
        cs = "UTF-8";
      else if (lcid == 0x0423) /* be_BY (Belarusian/Belarus) */
        cs = has_modifier ("@latin") ? "UTF-8" : "CP1251";
      else if (lcid == 0x0402) /* bg_BG (Bulgarian/Bulgaria) */
        cs = "CP1251";
      else if (lcid == 0x0422) /* uk_UA (Ukrainian/Ukraine) */
        cs = "KOI8-U";
      else
        cs = "ISO-8859-5";
--------------------------------------------------------------
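To make the effect of that branch easier to see, here is a minimal standalone sketch in plain C that mirrors the excerpt. The names charset_for_cp1251 and modifier_is are hypothetical and exist only for this illustration; the real code lives inline in nlsfuncs.cc and uses has_modifier(). The LCID values and charset names are copied from the excerpt.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for has_modifier(): compares the locale
   modifier (for example "@latin") with the wanted one. */
static int modifier_is(const char *modifier, const char *wanted)
{
    return modifier != NULL && strcmp(modifier, wanted) == 0;
}

/* Sketch of the codepage-1251 branch quoted above: maps an LCID and an
   optional modifier to the default charset name. Illustrative only. */
const char *charset_for_cp1251(unsigned lcid, const char *modifier)
{
    switch (lcid) {
    case 0x0c1a: /* sr_CS */
    case 0x1c1a: /* sr_BA */
    case 0x281a: /* sr_RS */
    case 0x301a: /* sr_ME */
    case 0x0440: /* ky_KG */
    case 0x0843: /* uz_UZ */
    case 0x0450: /* mn_MN */
        return "UTF-8";
    case 0x0444: /* tt_RU: UTF-8 only with the @iqtelif modifier */
        return modifier_is(modifier, "@iqtelif") ? "UTF-8" : "ISO-8859-5";
    case 0x0423: /* be_BY */
        return modifier_is(modifier, "@latin") ? "UTF-8" : "CP1251";
    case 0x0402: /* bg_BG */
        return "CP1251";
    case 0x0422: /* uk_UA */
        return "KOI8-U";
    default:     /* everything else, including ru_RU (0x0419) */
        return "ISO-8859-5";
    }
}
```

With this sketch, charset_for_cp1251(0x0419, NULL) falls through to the default case and yields ISO-8859-5, while charset_for_cp1251(0x0423, NULL) yields CP1251. That matches the difference between ru_RU and be_BY reported above.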
> Regards,
> Andrey.
As a temporary workaround, can you use UTF-8?
    export LANG=ru_RU.UTF-8
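
One way to check that an explicit charset is picked up is a small variation of the test program quoted earlier. This is only a sketch: codeset_for_locale is a hypothetical helper name, and it assumes the requested locale is available on the system (setlocale returns NULL otherwise).

```c
#include <stddef.h>
#include <locale.h>
#include <langinfo.h>

/* Hypothetical helper: switches the process to locale_name and returns
   the codeset reported by nl_langinfo(), or NULL if the locale cannot
   be selected. Passing "" follows the environment (LANG, LC_*). */
const char *codeset_for_locale(const char *locale_name)
{
    if (setlocale(LC_ALL, locale_name) == NULL)
        return NULL;
    return nl_langinfo(CODESET);
}
```

Where the locale is available, codeset_for_locale("ru_RU.UTF-8") should report UTF-8. Calling it with "" after the export above exercises the same path as the original test program.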
Regards
Marco