Mail Archives: cygwin/2020/10/10/13:21:44

From: Thomas Wolff <towo AT towo DOT net>
Subject: drop ambiguous-wide behaviour from Unicode CJK locales
To: cygwin AT cygwin DOT com
Message-ID: <036bb759-7d05-b31c-d77a-2dea5e51a3ba@towo.net>
Date: Sat, 10 Oct 2020 19:20:51 +0200

Ambiguous-wide behaviour (i.e. treating characters in the East Asian
Ambiguous width category as double-width) for CJK locales with UTF-8
encoding appears to be inconsistent with Linux locale definitions.
I've sent a patch to the newlib list that changes this. Characters like
─ ü æ are no longer wide in the following locales:
ja_JP.utf8
ko_KR.utf8
zh_*.utf8
They remain wide only in ja, ko, zh locales with a legacy encoding.
The explicit modifiers @cjkwide and @cjknarrow are not affected.
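
To illustrate what "ambiguous-wide" means in practice, here is a minimal Python sketch. It is not the newlib patch and not how Cygwin's wcwidth() is implemented. The helper name cell_width and its ambiguous_wide flag are made up for this example, and it ignores combining and control characters. It only shows that the characters mentioned above fall in the East Asian Ambiguous category, and how a width function might treat that category under the two policies.

```python
import unicodedata


def cell_width(ch: str, ambiguous_wide: bool) -> int:
    """Illustrative column width of a single character.

    Hypothetical helper: a simplified model, not newlib's wcwidth().
    Combining marks and control characters are not handled.
    """
    eaw = unicodedata.east_asian_width(ch)
    if eaw in ("W", "F"):
        # Wide and Fullwidth characters always take two columns.
        return 2
    if eaw == "A" and ambiguous_wide:
        # Ambiguous characters take two columns only under the wide policy.
        return 2
    return 1


if __name__ == "__main__":
    for ch in "─üæ":
        print(
            f"U+{ord(ch):04X} category={unicodedata.east_asian_width(ch)} "
            f"narrow-policy={cell_width(ch, False)} "
            f"wide-policy={cell_width(ch, True)}"
        )
```

In this model, the change described above corresponds to the UTF-8 CJK locales switching from ambiguous_wide=True to ambiguous_wide=False unless @cjkwide is given explicitly.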
Thomas




