Date: Sat, 6 Jun 2009 11:31:02 +0200
From: Corinna Vinschen
To: cygwin AT cygwin DOT com, newlib AT sourceware DOT org
Subject: Re: [Fwd: [1.7] wcwidth failing configure tests]
Message-ID: <20090606093102.GJ23519@calimero.vinschen.de>

On Jun  5 18:25, Thomas Wolff wrote:
> IWAMURO Motonori wrote:
> > 2009/5/21 Thomas Wolff:
> > >> > Therefore, I propose to use *_cjk() when the language part of LC_CTYPE
> > >> > is 'ja', 'ko', 'vi' or 'zh'.
> > > The problem with this is
> > > 1. As you say, there is no standard.
> > But,
> > - I think that my proposal doesn't violate any specification.
> I think it does. Part of the locale information is the "charmap"
> (called "codepage" on DOS/Windows).
> It may be implicit like
> with LC_CTYPE=zh_CN which defines "GB2312" as its charmap, but it
> is typically explicit like in en_US.UTF-8 - the intention is
> that the "codepage" information should be the same for all locales
> having the "UTF-8" (or any other) charmap. So you cannot freely
> change width information among locales with the same charmap.
> Also, if ja_JP.UTF-8 would mean "CJK width", how would you specify
> a working locale setting for a terminal that does not run a CJK width
> font but should yet use other Japanese settings? E.g. with rxvt which
> does not support CJK width.
>
> However, there is one resort within the locale mechanism that can be used;
> the locale syntax allows for an optional "modifier" which can be used to
> specify deviations, e.g.
>   de_DE has charmap ISO-8859-1
>   de_DE@euro has charmap ISO-8859-15
>   uz_UZ has charmap ISO-8859-1
>   uz_UZ@cyrillic has charmap UTF-8
>   aa_ER and aa_ER@saaho both have charmap UTF-8 (with some other difference).
> Thus you could define e.g.
>   ja_JP.UTF-8@cjk
> or
>   ja_JP.UTF-8@cjkwidth
> to indicate CJK width properties. I guess this is the most compliant way to go.

I like this approach.  It's also more flexible than using the language
specifier.  Thomas, couldn't you have discussed this in the two weeks I
was on vacation?  Why did you wait until I implemented the language-based
approach?

Now we just have to agree on the modifier, and somebody has to implement
this in newlib/libc/locale/locale.c.  So far the modifier is ignored
entirely (de_DE@euro will still use ISO-8859-1).

I vote for @cjkwide, regardless of Andy's objection.  People using
CJK will know the meaning, and it has the additional advantage of being
an identifier that is rather simple to memorize.
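
For illustration only, here is a minimal sketch of how a locale name of the form language[_territory][.charset][@modifier] could be split so that the modifier selects CJK width.  This is not the code in newlib/libc/locale/locale.c; the struct and function names (locale_parts, split_locale, locale_wants_cjk_width) are hypothetical, and "cjkwide" is only the modifier name proposed above, not an agreed one.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical container for the pieces of a locale name of the form
   language[_territory][.charset][@modifier].  Not newlib's own type. */
struct locale_parts {
    char base[32];      /* e.g. "ja_JP" */
    char charset[32];   /* e.g. "UTF-8"; empty if not given */
    char modifier[32];  /* e.g. "cjkwide"; empty if not given */
};

/* Copy len bytes from src (truncated to fit dst) and NUL-terminate. */
static void copy_part(char *dst, size_t dstsize, const char *src, size_t len)
{
    if (len >= dstsize)
        len = dstsize - 1;
    memcpy(dst, src, len);
    dst[len] = '\0';
}

/* Split a locale name such as "ja_JP.UTF-8@cjkwide" into its parts. */
void split_locale(const char *name, struct locale_parts *out)
{
    assert(name != NULL && out != NULL);

    const char *end = name + strlen(name);
    const char *at = strchr(name, '@');      /* modifier follows '@' */
    const char *mod_start = at ? at : end;

    /* Search for '.' only in the part that precedes the modifier. */
    const char *dot = memchr(name, '.', (size_t)(mod_start - name));
    const char *base_end = dot ? dot : mod_start;

    copy_part(out->base, sizeof out->base, name, (size_t)(base_end - name));

    if (dot)
        copy_part(out->charset, sizeof out->charset,
                  dot + 1, (size_t)(mod_start - dot - 1));
    else
        out->charset[0] = '\0';

    if (at)
        copy_part(out->modifier, sizeof out->modifier,
                  at + 1, (size_t)(end - at - 1));
    else
        out->modifier[0] = '\0';
}

/* Return 1 if the modifier requests CJK width, 0 otherwise.
   Assumes the proposed (not yet agreed) modifier name "cjkwide". */
int locale_wants_cjk_width(const char *name)
{
    struct locale_parts p;

    split_locale(name, &p);
    return strcmp(p.modifier, "cjkwide") == 0;
}
```

With this sketch, locale_wants_cjk_width("ja_JP.UTF-8@cjkwide") returns 1 and locale_wants_cjk_width("ja_JP.UTF-8") returns 0, so the same language and charmap can be used with either width behaviour.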

Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat