From: Thomas Wolff
Subject: drop ambiguous-wide behaviour from Unicode CJK locales
To: cygwin AT cygwin DOT com
Date: Sat, 10 Oct 2020 19:20:51 +0200
Message-ID: <036bb759-7d05-b31c-d77a-2dea5e51a3ba@towo.net>

It seems that ambiguous-wide behaviour (i.e. double-width property for
characters in the East Asian Ambiguous width category) for CJK locales
with UTF-8 encoding is inconsistent with Linux locale definitions.
I've sent a patch to the newlib list that changes that.

Characters like ─ ü æ are no longer wide in the following locales:
ja_JP.utf8
ko_KR.utf8
zh_*.utf8
but only in ja, ko, zh locales with legacy encoding.
Explicit modifiers @cjkwide and @cjknarrow are not affected.

Thomas
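For illustration only, here is a rough Python sketch of the rule as described above. It is not the newlib patch and does not call any Cygwin API. The helper names `ambiguous_is_wide` and `column_width` are hypothetical, the locale-name parsing is simplified, and treating a locale without an explicit codeset as "legacy" is an assumption. The width function also ignores combining and control characters.

```python
import unicodedata

CJK_LANGUAGES = ("ja", "ko", "zh")
UTF8_NAMES = ("utf8", "utf-8")


def ambiguous_is_wide(locale_name):
    """Hypothetical model of the behaviour described in the message."""
    base, _, modifier = locale_name.partition("@")
    # Explicit modifiers take precedence and are not affected by the change.
    if modifier == "cjkwide":
        return True
    if modifier == "cjknarrow":
        return False
    lang_territory, _, codeset = base.partition(".")
    language = lang_territory.split("_")[0]
    if language not in CJK_LANGUAGES:
        return False
    # Assumption: any codeset other than UTF-8 (including none) counts as legacy.
    return codeset.lower() not in UTF8_NAMES


def column_width(ch, wide_ambiguous):
    """Simplified width: 2 for Wide/Fullwidth, 2 for Ambiguous only if requested."""
    eaw = unicodedata.east_asian_width(ch)
    if eaw in ("W", "F"):
        return 2
    if eaw == "A" and wide_ambiguous:
        return 2
    return 1


if __name__ == "__main__":
    for name in ("ja_JP.utf8", "ja_JP.eucJP", "ko_KR.utf8@cjkwide", "zh_CN.utf8@cjknarrow"):
        wide = ambiguous_is_wide(name)
        print(name, [column_width(c, wide) for c in "─üæ"])
```

All three sample characters are in the East Asian Ambiguous category, so their width in this sketch depends only on the locale name passed in. On a real system the answer comes from `wcwidth()` under the active locale.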