Message-ID: <424C0145.9070202@byu.net>
Date: Thu, 31 Mar 2005 06:55:17 -0700
From: Eric Blake <ebb9@byu.net>
User-Agent: Mozilla Thunderbird 1.0.2 (Windows/20050317)
MIME-Version: 1.0
To: Boris New <boris.new@gmail.com>
CC: cygwin@cygwin.com
Subject: Re: Problem with join and accentuated characters
References: <cf47199205033104307d2a4a84@mail.gmail.com>
In-Reply-To: <cf47199205033104307d2a4a84@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit
X-IsSubscribed: yes

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

According to Boris New on 3/31/2005 5:30 AM:
> Hi,
> 
> Join in coreutils 5.3.0-3 gives incomplete results when the two files
> include French accented characters (for instance
> é|è|â|ï|ü|ê|ç|î|ô|û|ü|ë|à|ù).
> Results are okay when only one text file has accented characters.

I'll need more details on what you think is broken: ideally two short
files that you tried to join, the results you got, and the results you
expected.  Also, join in coreutils-5.3.0-3 is unmodified from the upstream
sources, so you may want to ask this question on the upstream list
(bug-coreutils@gnu.org).  That said, it may have something to do with file
encodings: if your two inputs use different encodings, the same accented
character is not necessarily stored as the same bytes, and join will then
not treat the keys as equal.  Also, join requires both files to be sorted
on the join fields; if they are not, the results are unpredictable.
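
Both points can be checked with a small experiment.  The following is only
a sketch under assumptions: the file names and the key "café" are made up
for illustration, one input is assumed to be ISO-8859-1 and the other
UTF-8, and iconv is assumed to be installed.  It is not a diagnosis of the
original report.

```shell
# Hypothetical demo files: the same key "café" stored in two encodings.
tmp=$(mktemp -d)
printf 'caf\351 latin1\n'   > "$tmp/a.txt"   # é as the ISO-8859-1 byte 0xE9
printf 'caf\303\251 utf8\n' > "$tmp/b.txt"   # é as the UTF-8 bytes 0xC3 0xA9

# The keys differ at the byte level, so join pairs nothing and prints nothing.
mismatch=$(LC_ALL=C join "$tmp/a.txt" "$tmp/b.txt")

# Convert one input so both share an encoding, then sort both files on the
# join field using the same locale that join itself will run under.
iconv -f ISO-8859-1 -t UTF-8 "$tmp/a.txt" | LC_ALL=C sort > "$tmp/a.sorted"
LC_ALL=C sort "$tmp/b.txt" > "$tmp/b.sorted"

# With matching bytes and sorted inputs, the two lines are joined on the key.
matched=$(LC_ALL=C join "$tmp/a.sorted" "$tmp/b.sorted")
printf '%s\n' "$matched"
```

Using LC_ALL=C for both sort and join is a deliberate choice in this
sketch: it makes the comparison purely byte-based, so the sort order that
join expects cannot drift from the order that sort produced.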

- --
Life is short - so eat dessert first!

Eric Blake             ebb9@byu.net
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.0 (Cygwin)
Comment: Public key at home.comcast.net/~ericblake/eblake.gpg
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFCTAFF84KuGfSFAYARAjX0AKCy83MHsdGJFx0kvsexYBPV6CnR2QCgrbfV
IVM1USqaQS3U8bdr1vV0Kck=
=D383
-----END PGP SIGNATURE-----


