Mail Archives: cygwin/2021/11/25/07:55:30

Date: Thu, 25 Nov 2021 13:54:42 +0100
To: cygwin AT cygwin DOT com
Subject: Re: raise(-1) has stopped returning an error recently
From: Corinna Vinschen via Cygwin <cygwin AT cygwin DOT com>

On Nov 24 11:01, Brian Inglis via Cygwin wrote:
> On 2021-11-24 02:25, Corinna Vinschen via Cygwin wrote:
> > > On Tue, Nov 23, 2021 at 11:18:25AM -0700, Brian Inglis wrote:
> > > > Do Cygwin and/or Windows support surrogate pairs in UTF-8?
> > 
> > You mean UTF-16.  UTF-8 doesn't know surrogate pairs, UTF-16 does.
> > Originally there was UCS-2, 16 bits, with only 65536 code points.
> > However, Unicode left the BMP already with version 2.0 in 1996, so
> > UTF-16 and surrogate pairs became necessary.  Windows as well as Cygwin
> > support them.
> 
> How does Cygwin support UTF-16 locales with surrogate pairs?

UTF-16 locales?  There's no such thing.  UTF-16 is just the 16-bit
encoding form of Unicode, and as such is independent of the locale.
On the user side, Cygwin only supports UTF-8 as Unicode representation.
Internally you can then convert UTF-8 strings to wchar_t, which is
UTF-16 on Cygwin.
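
To make the UTF-8 to UTF-16 step concrete, here is a minimal sketch in
plain C.  It is not Cygwin's or newlib's code; the helper names
utf8_decode and utf16_encode are made up for illustration.  It decodes
one UTF-8 sequence to a code point and then encodes that code point as
one or two 16-bit units, the two-unit case being a surrogate pair.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode one UTF-8 sequence from s (n bytes
   available).  Stores the code point in *cp and returns the number of
   bytes consumed, or 0 on truncated, malformed or overlong input. */
size_t utf8_decode(const unsigned char *s, size_t n, uint32_t *cp)
{
    /* Smallest code point that legitimately needs a given length. */
    static const uint32_t min_cp[5] = {0, 0, 0x80, 0x800, 0x10000};
    size_t len, i;
    uint32_t c;

    if (n == 0)
        return 0;
    if (s[0] < 0x80) {
        *cp = s[0];
        return 1;
    } else if ((s[0] & 0xE0) == 0xC0) {
        len = 2; c = s[0] & 0x1F;
    } else if ((s[0] & 0xF0) == 0xE0) {
        len = 3; c = s[0] & 0x0F;
    } else if ((s[0] & 0xF8) == 0xF0) {
        len = 4; c = s[0] & 0x07;
    } else {
        return 0;                       /* stray continuation or invalid lead byte */
    }
    if (n < len)
        return 0;
    for (i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80)
            return 0;                   /* not a continuation byte */
        c = (c << 6) | (s[i] & 0x3F);
    }
    if (c < min_cp[len])
        return 0;                       /* overlong encoding */
    *cp = c;
    return len;
}

/* Hypothetical helper: encode one code point as UTF-16.  Returns the
   number of 16-bit units written (1 or 2), or 0 if cp is out of range
   or is itself a surrogate value. */
size_t utf16_encode(uint32_t cp, uint16_t out[2])
{
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;
    if (cp < 0x10000) {                 /* inside the BMP: one unit */
        out[0] = (uint16_t)cp;
        return 1;
    }
    cp -= 0x10000;                      /* 20 bits remain */
    out[0] = (uint16_t)(0xD800 | (cp >> 10));    /* high surrogate */
    out[1] = (uint16_t)(0xDC00 | (cp & 0x3FF));  /* low surrogate */
    return 2;
}
```

Feeding the four bytes F0 9F 98 80 (U+1F600) through both functions
yields the surrogate pair 0xD83D 0xDE00, while a BMP character such as
U+20AC comes out as the single unit 0x20AC.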

> Are they the "native" locales inherited from Windows if others are not
> specified e.g. UTF-8, some OEM SBCS or MBCS?

Just try `locale -av' and you'll see all supported locales and their
respective default codeset.  All of them can be used with .utf8
specifier to use UTF-8 instead of the default codeset.  Some of them
use UTF-8 as default codeset anyway, e.g., fa_IR or yo_NG.
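
The same check can be made from C.  The following is a small sketch
using only the standard setlocale and nl_langinfo(CODESET) calls; the
helper names with_utf8 and codeset_of are invented for this example,
and whether a given locale name is accepted depends on the system.

```c
#include <langinfo.h>
#include <locale.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: build "<base>.utf8", e.g. "fa_IR" -> "fa_IR.utf8".
   Returns 0 on success, -1 if buf is too small. */
int with_utf8(const char *base, char *buf, size_t size)
{
    int n = snprintf(buf, size, "%s.utf8", base);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* Hypothetical helper: switch LC_CTYPE to locale_name and copy the
   codeset it reports into out.  Returns 0 on success, -1 if the
   system does not accept that locale name. */
int codeset_of(const char *locale_name, char *out, size_t size)
{
    if (setlocale(LC_CTYPE, locale_name) == NULL)
        return -1;
    snprintf(out, size, "%s", nl_langinfo(CODESET));
    return 0;
}
```

Calling codeset_of once with a bare locale name and once with the name
produced by with_utf8 shows whether the .utf8 specifier changes the
codeset for that locale.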

> > > There are 3 tests in surrogate-pair and only the 3rd one failed. So I guess
> > > surrogate pairs in UTF-8 "mostly work".
> > 
> > UTF-16.  The surrogate stuff is evil at times.  Have a look at the
> > __utf8_wctomb function in
> > https://sourceware.org/git/?p=newlib-cygwin.git;a=blob;f=newlib/libc/stdlib/wctomb_r.c
> > Lone surrogate halves in an input stream are a problem, for instance.
> 
> Thus the confusion with grep surrogate pair tests which appear to be running
> under a UTF-8 locale: see attached surrogate pair extract from cygport
> --debug grep.cygport check.

An STC (simple test case) in plain C might be helpful.
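
As an illustration of the lone-surrogate problem mentioned in the
quoted text, here is a short sketch of a UTF-16 well-formedness check.
It is not the logic of __utf8_wctomb and not the requested test case;
the function name first_lone_surrogate is hypothetical.  It scans a
buffer of 16-bit units and reports the first high surrogate without a
following low surrogate, or low surrogate without a preceding high one.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: return the index of the first lone surrogate
   in the UTF-16 buffer s of n units, or n if every surrogate is part
   of a proper high/low pair. */
size_t first_lone_surrogate(const uint16_t *s, size_t n)
{
    size_t i = 0;

    while (i < n) {
        uint16_t u = s[i];

        if (u >= 0xD800 && u <= 0xDBFF) {
            /* High surrogate: must be followed by a low surrogate. */
            if (i + 1 < n && s[i + 1] >= 0xDC00 && s[i + 1] <= 0xDFFF) {
                i += 2;
                continue;
            }
            return i;
        }
        if (u >= 0xDC00 && u <= 0xDFFF)
            return i;               /* low surrogate with no high before it */
        i++;
    }
    return n;
}
```

A converter that works one unit at a time has to remember a pending
high surrogate between calls, which is where input such as a buffer
ending in 0xD83D becomes awkward.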


Corinna

-- 
Problem reports:      https://cygwin.com/problems.html
FAQ:                  https://cygwin.com/faq/
Documentation:        https://cygwin.com/docs.html
Unsubscribe info:     https://cygwin.com/ml/#unsubscribe-simple
