To: cygwin <cygwin@cygwin.com>
From: Ken Brown <kbrown@cornell.edu>
Subject: Bug in collation functions?
Message-ID: <563148AF.1000502@cornell.edu>
Date: Wed, 28 Oct 2015 18:14:07 -0400

It's my understanding that collation is supposed to take whitespace and 
punctuation into account in the POSIX locale but not in other locales. 
This doesn't seem to be the case on Cygwin.  Here's a test case using 
wcscoll, but the same problem occurs with strcoll.

$ cat wcscoll_test.c
#include <wchar.h>
#include <stdio.h>
#include <locale.h>

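/* Compare a and b under the LC_COLLATE locale loc and print the result.  */
void
compare (const wchar_t *a, const wchar_t *b, const char *loc)
{
   /* Without this check, a missing locale would silently leave the
      previous collation rules in effect.  */
   if (setlocale (LC_COLLATE, loc) == NULL)
     {
       printf ("cannot set LC_COLLATE to %s\n", loc);
       return;
     }
   int cmp = wcscoll (a, b);
   char res = cmp < 0 ? '<' : cmp > 0 ? '>' : '=';
   printf ("\"%ls\" %c \"%ls\" in %s locale\n", a, res, b, loc);
}

int
main (void)
{
   compare (L"11", L"1.1", "POSIX");
   compare (L"11", L"1.1", "en_US.UTF-8");
   compare (L"11", L"1 2", "POSIX");
   compare (L"11", L"1 2", "en_US.UTF-8");
   return 0;
}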

$ gcc wcscoll_test.c -o wcscoll_test

$ ./wcscoll_test
"11" > "1.1" in POSIX locale
"11" > "1.1" in en_US.UTF-8 locale
"11" > "1 2" in POSIX locale
"11" > "1 2" in en_US.UTF-8 locale

On Linux, the output from the same program is

"11" > "1.1" in POSIX locale
"11" < "1.1" in en_US.UTF-8 locale
"11" > "1 2" in POSIX locale
"11" < "1 2" in en_US.UTF-8 locale

Ken
