To: cygwin@cygwin.com
Subject: Re: Japanese/Chinese language question
References: <20100121134055.GE2402@calimero.vinschen.de>
From: Kazuhiro Fujieda <fujieda@acm.org>
Date: Sat, 30 Jan 2010 05:00:24 +0900
In-Reply-To: <20100121134055.GE2402@calimero.vinschen.de> (Corinna Vinschen's message of "Thu\, 21 Jan 2010 14\:40\:55 +0100")
Message-ID: <uiqak1y1z.fsf@acm.org>

>>> On Thu, 21 Jan 2010 14:40:55 +0100
>>> Corinna Vinschen said:

> When comparing strings linguistically (strcoll/wcscoll),
>
> - are Hiragana and Katakana forms of the same character to be
>   treated as equal or as different?

They should be treated as different.

> - are half-width and full-width forms of the same CJK character
>   treated as equal or as different?

Different, too.

It is difficult to implement the collation algorithm from
scratch. I recommend using LCMapString to generate sort keys.
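
For illustration only, here is a minimal sketch of the sort-key approach,
not a proposal for the actual Cygwin code. It assumes LCMapStringW is
called with LCMAP_SORTKEY and without NORM_IGNOREKANATYPE or
NORM_IGNOREWIDTH, so that Hiragana/Katakana and half-width/full-width
forms stay different. The names make_sort_key, key_cmp and
compare_by_sort_key are hypothetical, LOCALE_USER_DEFAULT is only a
placeholder for whatever locale the caller really wants, and the
wcsxfrm branch is a stand-in so the sketch also builds where
LCMapString does not exist.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

#if defined(_WIN32) || defined(__CYGWIN__)
#include <windows.h>

/* Hypothetical helper: build a malloc'ed sort key for s, NULL on failure. */
void *make_sort_key(const wchar_t *s)
{
    /* First call only asks for the key size in bytes (terminator included).
       No NORM_IGNOREKANATYPE / NORM_IGNOREWIDTH: kana type and width count. */
    int len = LCMapStringW(LOCALE_USER_DEFAULT, LCMAP_SORTKEY, s, -1, NULL, 0);
    if (len <= 0)
        return NULL;
    unsigned char *key = malloc((size_t)len);
    if (key != NULL &&
        LCMapStringW(LOCALE_USER_DEFAULT, LCMAP_SORTKEY, s, -1,
                     (LPWSTR)key, len) <= 0) {
        free(key);
        key = NULL;
    }
    return key;
}

/* LCMAP_SORTKEY keys are NUL-terminated byte strings: compare with strcmp. */
int key_cmp(const void *x, const void *y) { return strcmp(x, y); }

#else

/* Stand-in (assumption): wcsxfrm plays the role of the sort-key generator. */
void *make_sort_key(const wchar_t *s)
{
    size_t n = wcsxfrm(NULL, s, 0);      /* key length without terminator */
    if (n == (size_t)-1)
        return NULL;
    wchar_t *key = malloc((n + 1) * sizeof *key);
    if (key != NULL)
        wcsxfrm(key, s, n + 1);
    return key;
}

/* wcsxfrm keys are wide strings: compare with wcscmp. */
int key_cmp(const void *x, const void *y) { return wcscmp(x, y); }

#endif

/* Returns <0, 0 or >0 in the manner of wcscoll. If a key cannot be built,
   falls back to plain wcscmp on the original strings. */
int compare_by_sort_key(const wchar_t *a, const wchar_t *b)
{
    assert(a != NULL && b != NULL);
    void *ka = make_sort_key(a);
    void *kb = make_sort_key(b);
    int r = (ka != NULL && kb != NULL) ? key_cmp(ka, kb) : wcscmp(a, b);
    free(ka);
    free(kb);
    return r;
}
```

With this sketch, compare_by_sort_key(L"\u3042", L"\u30A2") (Hiragana A
against Katakana A) gives a nonzero result, and two identical strings
compare equal. A caller that sorts many strings would probably want to
generate each key once and keep it, instead of rebuilding both keys on
every comparison as done here.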
-- 
Kazuhiro Fujieda
