To: cygwin AT cygwin DOT com
Subject: Re: Japanese/Chinese language question
From: Kazuhiro Fujieda
Date: Sat, 30 Jan 2010 05:00:24 +0900
In-Reply-To: <20100121134055.GE2402@calimero.vinschen.de> (Corinna Vinschen's message of "Thu, 21 Jan 2010 14:40:55 +0100")

>>> On Thu, 21 Jan 2010 14:40:55 +0100
>>> Corinna Vinschen said:

> When comparing strings linguistically (strcoll/wcscoll),
>
> - are Hiragana and Katakana forms of the same character to be
>   treated as equal or as different?

They should be treated as different.

> - are half-width and full-width forms of the same CJK character
>   treated as equal or as different?

Different, too.

Implementing the collation algorithm from scratch is difficult.
I recommend using LCMapString to generate sort keys.
-- 
Kazuhiro Fujieda
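
To illustrate the sort-key approach: each string is transformed once
into a key, and the keys are then compared with a plain, non-linguistic
comparison. The sketch below is not the Cygwin implementation and does
not call LCMapString, which is Windows-only; it uses the standard C
function wcsxfrm as a portable stand-in for the same technique. The
helper names make_sort_key and compare_by_sort_key are hypothetical,
and the ordering obtained depends on the LC_COLLATE locale in effect.

```c
#include <locale.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper: build a sort key for s under the current
   LC_COLLATE locale. Returns a malloc'd wide string that the caller
   must free, or NULL if allocation fails. */
static wchar_t *make_sort_key(const wchar_t *s)
{
    /* With a size of 0 the destination may be NULL; the return value
       is the key length, not counting the terminating null. */
    size_t n = wcsxfrm(NULL, s, 0);
    wchar_t *key = malloc((n + 1) * sizeof *key);
    if (key != NULL)
        wcsxfrm(key, s, n + 1);
    return key;
}

/* Hypothetical helper: compare two strings through their sort keys.
   The sign of the result matches wcscoll(a, b). If a key cannot be
   allocated, fall back to calling wcscoll directly. */
static int compare_by_sort_key(const wchar_t *a, const wchar_t *b)
{
    wchar_t *ka = make_sort_key(a);
    wchar_t *kb = make_sort_key(b);
    int r = (ka != NULL && kb != NULL) ? wcscmp(ka, kb) : wcscoll(a, b);
    free(ka);
    free(kb);
    return r;
}
```

A caller would normally run setlocale(LC_COLLATE, "") once at startup
and then, for example, check that
compare_by_sort_key(L"\u3042", L"\u30A2") is nonzero, i.e. that
Hiragana A and Katakana A compare as different. Generating keys pays
off mainly when the same strings are compared many times, as in a
sort, because each string is transformed only once.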