Mail Archives: cygwin/2005/03/31/22:17:55

Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm
List-Subscribe: <mailto:cygwin-subscribe AT cygwin DOT com>
List-Archive: <http://sourceware.org/ml/cygwin/>
List-Post: <mailto:cygwin AT cygwin DOT com>
List-Help: <mailto:cygwin-help AT cygwin DOT com>, <http://sourceware.org/ml/#faqs>
Sender: cygwin-owner AT cygwin DOT com
Mail-Followup-To: cygwin AT cygwin DOT com
Delivered-To: mailing list cygwin AT cygwin DOT com
Message-ID: <424CBD43.3050602@byu.net>
Date: Thu, 31 Mar 2005 20:17:23 -0700
From: Eric Blake <ebb9 AT byu DOT net>
User-Agent: Mozilla Thunderbird 1.0.2 (Windows/20050317)
MIME-Version: 1.0
To: Boris New <boris DOT new AT gmail DOT com>, cygwin AT cygwin DOT com, bug-coreutils AT gnu DOT org
Subject: Re: Probem with join and accentuated characters
References: <cf47199205033104307d2a4a84 AT mail DOT gmail DOT com> <424C0145 DOT 9070202 AT byu DOT net> <cf4719920503311254c7d48f2 AT mail DOT gmail DOT com>
In-Reply-To: <cf4719920503311254c7d48f2@mail.gmail.com>
X-IsSubscribed: yes

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

According to Boris New on 3/31/2005 1:54 PM:
> Hi,
> 
> I send you the zip file with the two files. I tested a lot of windows
> port and all have this problem. I thought it was perhaps due to locale
> on windows.
> The format is the same and files are sorted. Everything is ok if I
> remove accentuated words from rand.txt.

Contrary to your assertion, your files were not sorted.  Or put another
way, they weren't sorted by the rules that join expects.  Some locales
treat é and e as the same collating character, but the C locale, which is
the default on cygwin, is not one of them: it compares raw byte values, so
é sorts after every unaccented ASCII letter.  Hence, join gave up after the
first line where the ordering failed to match its expectations.

Run the following on each of your two input files (rand.txt shown here) to
see this:
$ sort < rand.txt > randsort.txt
$ diff rand.txt randsort.txt

Only if diff reports no differences for both files will join work the way
you want in the locale you are using.
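
As a rough illustration of the point above, here is a minimal sketch using
made-up data.  The file names left.txt and right.txt and their contents are
hypothetical, not the files from this thread.  It assumes a POSIX shell with
the usual sort and join, and forces the C locale with LC_ALL=C so that
sorting and joining follow the same rules.

```shell
# Hypothetical sample data: file names and contents are invented for illustration.
dir=$(mktemp -d)

# \303\251 is the UTF-8 encoding of é, written as octal escapes so the
# script does not depend on the encoding of this text.
# Both files are in "dictionary" order, with été between eclair and zebra.
printf 'eclair 1\n\303\251t\303\251 2\nzebra 3\n' > "$dir/left.txt"
printf 'eclair x\n\303\251t\303\251 y\nzebra z\n' > "$dir/right.txt"

# sort -c only checks the order; it exits non-zero if the file is not
# sorted under the rules of the locale in effect (forced to C here).
if LC_ALL=C sort -c "$dir/left.txt" 2>/dev/null; then
    left_status=sorted
else
    left_status=unsorted
fi
echo "left.txt under LC_ALL=C: $left_status"

# Re-sort both inputs, then run join, all under the same locale.
LC_ALL=C sort "$dir/left.txt"  > "$dir/left.sorted"
LC_ALL=C sort "$dir/right.txt" > "$dir/right.sorted"
joined=$(LC_ALL=C join "$dir/left.sorted" "$dir/right.sorted")
printf '%s\n' "$joined"
```

In the C locale the byte for é is larger than the byte for z, so the
dictionary-ordered input counts as unsorted, and the re-sorted files put été
after zebra.  The choice of locale matters less than using the same one for
sort and for join.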

- --
Life is short - so eat dessert first!

Eric Blake             ebb9 AT byu DOT net

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.0 (Cygwin)
Comment: Public key at home.comcast.net/~ericblake/eblake.gpg
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFCTL1D84KuGfSFAYARAud5AJ0ZzqemqItQ3oTcMiYqz08dtojsyQCeNe/O
n5V+udbdLBKFEbW/Qg8pOGU=
=pvms
-----END PGP SIGNATURE-----

--
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
Problem reports:       http://cygwin.com/problems.html
Documentation:         http://cygwin.com/docs.html
FAQ:                   http://cygwin.com/faq/

  Copyright © 2019   by DJ Delorie     Updated Jul 2019