Mail Archives: cygwin/2023/03/24/08:19:22

Date: Fri, 24 Mar 2023 13:18:04 +0100
To: cygwin AT cygwin DOT com
Subject: Re: newlocale: Linux incompatibility
Message-ID: <ZB2U/JCFrwSUo1+U@calimero.vinschen.de>
Mail-Followup-To: cygwin AT cygwin DOT com
References: <bd7ebebc-cf90-d509-ef15-11d702a6126c AT cornell DOT edu>
<ZBzBL2jFO7Oltjd1 AT calimero DOT vinschen DOT de>
MIME-Version: 1.0
In-Reply-To: <ZBzBL2jFO7Oltjd1@calimero.vinschen.de>
List-Id: General Cygwin discussions and problem reports <cygwin.cygwin.com>
List-Unsubscribe: <https://cygwin.com/mailman/options/cygwin>,
<mailto:cygwin-request AT cygwin DOT com?subject=unsubscribe>
List-Archive: <https://cygwin.com/pipermail/cygwin/>
List-Post: <mailto:cygwin AT cygwin DOT com>
List-Help: <mailto:cygwin-request AT cygwin DOT com?subject=help>
List-Subscribe: <https://cygwin.com/mailman/listinfo/cygwin>,
<mailto:cygwin-request AT cygwin DOT com?subject=subscribe>
From: Corinna Vinschen via Cygwin <cygwin AT cygwin DOT com>
Reply-To: cygwin AT cygwin DOT com
Cc: Corinna Vinschen <corinna-cygwin AT cygwin DOT com>
Errors-To: cygwin-bounces+archive-cygwin=delorie DOT com AT cygwin DOT com
Sender: "Cygwin" <cygwin-bounces+archive-cygwin=delorie DOT com AT cygwin DOT com>

On Mar 23 22:14, Corinna Vinschen via Cygwin wrote:
> On Mar 23 15:48, Ken Brown via Cygwin wrote:
> > I'm reporting this here rather than the newlib list because the behavior is
> > compatible with Posix but not Linux, so I think it's a Cygwin issue.
> 
> Actually, it's a Windows issue :)
> 
> > Consider the following test case:
> > 
> > $ cat locale_test.c
> > #include <stdio.h>
> > #include <locale.h>
> > 
> > int main ()
> > {
> >   const char *locale = "en_DE.UTF-8";
> >   locale_t loc = newlocale (LC_COLLATE_MASK | LC_CTYPE_MASK, locale, 0);
> >   if (!loc)
> >     perror ("newlocale");
> >   else
> >     printf ("newlocale succeeded on invalid locale %s\n", locale);
> > }
> > 
> > $ gcc -o locale_test locale_test.c
> > 
> > $ ./locale_test.exe
> > newlocale succeeded on invalid locale en_DE.UTF-8
> > 
> > On Linux, the newlocale call fails with ENOENT, as is documented on the man
> > page.  Posix doesn't say what should happen on an invalid locale, so this is
> > not, strictly speaking, a bug.
> 
> Three bugs in fact.
> 
> First, it's a bug in the Emacs testsuite.  The test simply assumes that
> there's no en_DE locale on any system, but that's just not true.
> Windows supports the RFC 5646 locale "en-DE", which is called "English
> (Germany)" in the "Region" settings.
> 
> You can also check with `locale -av | less' and search for en_DE.
> 
> For the remainder of this mail, I assume you're talking about Cygwin 3.5.
> I won't fix this for 3.4 anymore, given how much locale handling has
> changed for 3.5.
> 
> The second bug is that Cygwin blindly trusts the Windows function
> ResolveLocaleName().  That function blatantly converts even vaguely
> similar locales into something it supports.  E.g., it converts "en-XY"
> to "en-US".  I.e., even if you use "en_XY.utf8" as locale, the above
> testcase will wrongly succeed.  So I have to rethink how I resolve POSIX
> locales to Windows locales.
> 
> And the third bug is that Cygwin fails to set errno if it doesn't
> support a locale, but that's a minor inconvenience in comparison.
> 
> Thanks for the report, I totally missed the above problem with
> ResolveLocaleName.

I pushed a couple of patches which hopefully clean up the code.  It's
really frustrating how these Windows locale functions work.  Or, rather,
don't work.  I mean, come on...

- ResolveLocaleName() resolves "ff-BF" to "ff-Latn-SN", not to
  "ff-Adlm-BF" or "ff-Latn-BF", even though both exist.  

- There's a locale called "sd-Arab-PK" and a locale "sd-Deva-IN".  If
  you ask for the script used in "sd-IN", the result is "Arab", not
  "Deva".

/*facepalm*/

I had to create a replacement function for ResolveLocaleName which
doesn't return totally screwy and unexpected results, and special case
two more locales in /proc/locales output so the output makes sense.
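
To illustrate the idea of strict resolution (this is not the actual
Cygwin code, only a minimal sketch): a POSIX locale name is turned into
a "lang-REGION" tag and accepted only on an exact match, instead of
being mapped to a "similar" locale.  The names posix_to_tag,
tag_is_known and the known_tags table are hypothetical; a real
implementation would query the OS for its locale list and would have to
deal with scripts and modifiers, which this sketch simply drops.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the locale names the OS really offers. */
static const char *const known_tags[] = {
  "en-US", "en-DE", "de-DE", "ff-Latn-SN"
};

/* Convert a POSIX locale name such as "en_DE.UTF-8" into a tag such as
   "en-DE": cut off the codeset (".UTF-8") and modifier ("@euro"), then
   replace '_' with '-'.  Returns 0 on success, -1 if the name is empty
   or the result does not fit into the buffer. */
int
posix_to_tag (const char *posix, char *tag, size_t size)
{
  assert (posix != NULL && tag != NULL);

  size_t len = strcspn (posix, ".@");	/* stop at codeset or modifier */
  if (len == 0 || len >= size)
    return -1;
  memcpy (tag, posix, len);
  tag[len] = '\0';
  for (char *p = tag; *p; ++p)
    if (*p == '_')
      *p = '-';
  return 0;
}

/* Strict lookup: a tag is valid only if it matches a known name
   exactly.  There is no fallback to a vaguely similar locale.
   Returns 1 if known, 0 otherwise. */
int
tag_is_known (const char *tag)
{
  assert (tag != NULL);

  for (size_t i = 0; i < sizeof known_tags / sizeof known_tags[0]; ++i)
    if (strcmp (tag, known_tags[i]) == 0)
      return 1;
  return 0;
}
```

With this table, "en_DE.UTF-8" becomes "en-DE" and is accepted, while
"en_XY.utf8" becomes "en-XY" and is rejected rather than silently
turned into "en-US".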

Oh, and I added error handling to the code so newlocale is now able to
set errno to ENOENT if the locale is not supported.

If you want to test this, the changes are in test release
3.5.0-0.260.gb5b67a65f87c, which is just building.


HTH,
Corinna

-- 
Problem reports:      https://cygwin.com/problems.html
FAQ:                  https://cygwin.com/faq/
Documentation:        https://cygwin.com/docs.html
Unsubscribe info:     https://cygwin.com/ml/#unsubscribe-simple
