Mail Archives: cygwin/2017/08/05/16:53:50

Subject: Re: Unicode width data inconsistent/outdated
To: cygwin AT cygwin DOT com
From: Thomas Wolff <towo AT towo DOT net>
Message-ID: <1018cbbf-e04d-3207-cafe-5a40c630bfa6@towo.net>
Date: Sat, 5 Aug 2017 22:53:22 +0200

On 05.08.2017 at 22:24, Brian Inglis wrote:
> On 2017-08-05 13:06, Thomas Wolff wrote:
> ...
>> Which other platforms do actually use newlib?
> Many historical uPs and current uCs used in embedded systems supporting gcc not
> using Linux, including RTEMS, devKits for Nintendo and Sony game systems, some
> Android, Google NaCl.
Do they all treat wchar_t as being encoded locale-specifically? I doubt it.
https://www.gnu.org/software/libunistring/manual/html_node/The-wchar_005ft-mess.html
singles out Solaris and FreeBSD in particular, and no others.
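
For illustration, standard C gives a platform a way to state that its wchar_t
holds Unicode values: the macro __STDC_ISO_10646__ (C99). The following is only
a sketch; the function name wchar_is_unicode is made up here, and I am not
claiming that newlib tests this macro. Note that a missing macro does not prove
that wchar_t is locale-specific, only that the platform makes no promise.

```c
#include <wchar.h>

/* Hypothetical helper: returns 1 if the implementation promises that
 * wchar_t values are ISO 10646 (Unicode) code points, 0 if it makes
 * no such promise. */
int wchar_is_unicode(void)
{
#ifdef __STDC_ISO_10646__
    return 1;
#else
    return 0;
#endif
}
```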

>>>> Issue 3 is the special conversion jp2uc which seems to be half-baked;
>>>> there is no such handling for Chinese or Korean.
>>> This shouldn't matter to you, just keep it in place. It's a historical, low
>>> footprint conversion for Japanese characters without pulling in the Unicode
>>> stuff. Not used on Cygwin so just ignore.
>> I had noticed meanwhile that this is not active in Cygwin, but it's broken
>> anyway for multiple reasons:
>> * platforms for which wchar_t is not Unicode should be explicitly listed
>> * if used, the transformation needs to be applied to all non-Unicode locales
>> (also Chinese, Korean, and even 8-bit locales such as *.CP1252)
>> * for towupper and towlower, the result must be back-transformed into the
>> respective locale encoding
>> * in particular, the locale-specific _l functions inconsistently do not use
>> the transformation but have this note:
>>> We're using a locale-independent representation of upper/lower case based
>>> on Unicode data. Thus, the locale doesn't matter.
>> So I'd suggest to drop that stuff unless someone would like to fix it.
> Looks like JIS support is under newlib/iconvdata
So maybe the conversion could call jisx0201_to_ucs4 etc. from there, and
the back-conversion needed for towupper/towlower would be available as well.
But then the code is still broken for the other reasons. I could map
the _l functions properly, if that's really desired, but how should
other encodings be handled, and on which platforms?
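
To make the back-transformation point concrete, here is a minimal sketch of
what towupper would have to do if wchar_t held CP1252 values instead of
Unicode: convert to Unicode, apply the case mapping, convert back. All names
(sketch_to_ucs, sketch_from_ucs, sketch_ucs_toupper, sketch_towupper) are
hypothetical and not newlib functions, and both tables are toy subsets chosen
for illustration only.

```c
#include <stddef.h>
#include <wchar.h>

/* Toy subset of the CP1252 0x80..0x9F range and its Unicode equivalents. */
struct cp_pair { wint_t cp1252; wint_t ucs; };

static const struct cp_pair cp1252_subset[] = {
    { 0x8A, 0x0160 }, /* S with caron */
    { 0x9A, 0x0161 }, /* s with caron */
    { 0x8C, 0x0152 }, /* OE ligature */
    { 0x9C, 0x0153 }, /* oe ligature */
    { 0x8E, 0x017D }, /* Z with caron */
    { 0x9E, 0x017E }, /* z with caron */
};
#define N_PAIRS (sizeof cp1252_subset / sizeof cp1252_subset[0])

/* Hypothetical forward transformation: locale-encoded value -> Unicode.
 * Values outside the toy subset are passed through unchanged. */
static wint_t sketch_to_ucs(wint_t c)
{
    for (size_t i = 0; i < N_PAIRS; i++)
        if (cp1252_subset[i].cp1252 == c)
            return cp1252_subset[i].ucs;
    return c;
}

/* Hypothetical back-transformation: Unicode -> locale-encoded value. */
static wint_t sketch_from_ucs(wint_t u)
{
    for (size_t i = 0; i < N_PAIRS; i++)
        if (cp1252_subset[i].ucs == u)
            return cp1252_subset[i].cp1252;
    return u;
}

/* Toy Unicode upper-casing: ASCII letters plus the three pairs above. */
static wint_t sketch_ucs_toupper(wint_t u)
{
    if (u >= (wint_t)'a' && u <= (wint_t)'z')
        return u - ((wint_t)'a' - (wint_t)'A');
    if (u == 0x0161 || u == 0x0153 || u == 0x017E)
        return u - 1; /* upper-case form is the preceding code point */
    return u;
}

/* Case mapping on a locale-encoded value: transform, map, transform back. */
wint_t sketch_towupper(wint_t c)
{
    return sketch_from_ucs(sketch_ucs_toupper(sketch_to_ucs(c)));
}
```

Without the final sketch_from_ucs step, sketch_towupper(0x9A) would return the
Unicode value 0x0160 rather than the CP1252 value 0x8A, which is the kind of
inconsistency described above.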

Thomas
