Subject: Re: Unicode width data inconsistent/outdated
To: cygwin AT cygwin DOT com
References: <20170726080859 DOT GA24312 AT calimero DOT vinschen DOT de> <5d3cb047-49f8-26a6-d816-387a71486e99 AT cygwin DOT com> <20170726095016 DOT GA25666 AT calimero DOT vinschen DOT de>
From: Thomas Wolff
Message-ID: <289bd98b-e644-888d-07f8-8965b6538373@towo.net>
Date: Wed, 26 Jul 2017 23:43:43 +0200
In-Reply-To: <20170726095016.GA25666@calimero.vinschen.de>

On 26.07.2017 at 11:50, Corinna Vinschen wrote:
> On Jul 26 03:16, Yaakov Selkowitz wrote:
>> On 2017-07-26 03:08, Corinna Vinschen wrote:
>>> On Jul 26 08:49, Thomas Wolff wrote:
>>>> It would be good to keep wcwidth/wcswidth in sync with the installed
>>>> Unicode data version (package unicode-ucd).
>>>> Currently it seems to be hard-coded (in newlib/libc/string/wcwidth.c);
>>>> it refers to Unicode 5.0 while installed Unicode data suggest 9.0 would
>>>> be used.
>>>> I can provide some scripts to generate the respective tables if desired.
>>>> Thomas
>>> If you can update the newlib files this way and send matching patches
>>> to the newlib list, this would be highly appreciated.
>> Thomas, I just updated unicode-ucd to 10.0 for this purpose. Thanks.
>
> Oh, and, btw, the comment in wcwidth.c isn't quite correct. The
> cwstate in newlib is on Unicode 5.2, see newlib/libc/ctype/towupper.c.

Oh, so there are a number of other embedded tables as well.
To make the tow* and isw* functions easier to adapt to Unicode updates,
some revisions will be needed here. And the to* and is* ones (without
the 'w') even refer to locales in a way I do not understand.
Maybe I'll restrict my effort to wcwidth first...

Thomas
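
As a rough illustration of the kind of table-driven width lookup discussed in this thread, here is a minimal sketch in C. It is not the newlib code: the names cp_range, in_table and sketch_wcwidth are made up for this example, and the two tables are small hand-picked excerpts, not generated Unicode data. The point is only that once the ranges live in sorted tables, moving to a newer Unicode version means regenerating the tables while the lookup stays the same.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Inclusive code point range (hypothetical type for this sketch). */
struct cp_range {
    uint32_t first;
    uint32_t last;
};

/* Illustrative excerpts only; sorted and non-overlapping.
 * Real tables would be generated from the Unicode data files. */
static const struct cp_range zero_width[] = {
    { 0x0300, 0x036F },   /* Combining Diacritical Marks */
    { 0xFE20, 0xFE2F },   /* Combining Half Marks */
};

static const struct cp_range wide[] = {
    { 0x1100, 0x115F },   /* Hangul Jamo initial consonants */
    { 0x4E00, 0x9FFF },   /* CJK Unified Ideographs block */
    { 0xAC00, 0xD7A3 },   /* Hangul Syllables */
    { 0xFF01, 0xFF60 },   /* Fullwidth ASCII variants */
};

#define TABLE_LEN(t) (sizeof(t) / sizeof((t)[0]))

/* Binary search: returns 1 if cp falls inside any range of the table. */
static int in_table(uint32_t cp, const struct cp_range *t, size_t n)
{
    size_t lo = 0, hi = n;

    assert(t != NULL || n == 0);
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (cp < t[mid].first)
            hi = mid;
        else if (cp > t[mid].last)
            lo = mid + 1;
        else
            return 1;
    }
    return 0;
}

/* Column width of one code point: -1 for control characters,
 * 0 for NUL and zero-width ranges, 2 for wide ranges, else 1. */
int sketch_wcwidth(uint32_t cp)
{
    if (cp == 0)
        return 0;
    if (cp < 0x20 || (cp >= 0x7F && cp < 0xA0))
        return -1;
    if (in_table(cp, zero_width, TABLE_LEN(zero_width)))
        return 0;
    if (in_table(cp, wide, TABLE_LEN(wide)))
        return 2;
    return 1;
}
```

With complete tables, a generator script would only need to emit the two arrays; the assumption here is that the width categories can be expressed as plain sorted ranges.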