Message-ID: <45720808.5040409@tlinx.org>
Date: Sat, 02 Dec 2006 15:11:04 -0800
From: Linda Walsh <cygwin@tlinx.org>
User-Agent: Thunderbird 1.5.0.8 (Windows/20061025)
MIME-Version: 1.0
To: cygwin@cygwin.com
Subject: Re: Windows NTFS UCS2 characters
References: <C1946630.19DC4%eljay@adobe.com> <456F0E89.28B2E427@dessent.net> <Pine.GSO.4.63.0611301226090.10187@access1.cims.nyu.edu>
In-Reply-To: <Pine.GSO.4.63.0611301226090.10187@access1.cims.nyu.edu>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit

Igor Peshansky wrote:
> The former is true, the latter is half-true. Cygwin works with the
> default codepage when the Windows locale settings are set correctly.  You
> cannot *switch* locales programmatically from within Cygwin, but it can
> handle the full 8-bit charset just fine.
>
> Not sure what ANSI means in this context (if you meant ASCII, or 7-bit,
> then the codepage reference makes no sense).  If the codepage is set
> correctly, Cygwin will read those files.
>   
---
    I wish the problem were that simple.  But files created in Windows
aren't all created under any _one_ codepage.  Most of my files read fine
under cp850/437 (or an iso8859-1 equivalent), but not all of them.

    A few files, in a most annoying spot, use characters that the
western/latin-1 charset doesn't support.  They're in a Music folder
containing world music.  I'd like to use "rsync" to copy the music
to my MP3 device, but two different codepages would be required:
some files have French names that encode under an iso8859-1
equivalent codepage, but music in an adjacent directory is Middle
Eastern.  That requires a different, Turkish codepage.

    So you see, there is no [single] codepage that will work to
copy (or read) the files in Cygwin.
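
    To make the point concrete, here is a minimal sketch in Python (not
anything Cygwin itself does).  Both file names are invented for
illustration, and the second uses Hebrew letters purely as a stand-in
for a non-Latin name; the candidate list is just a handful of 8-bit
codepages plus UTF-8 under Python's codec names.  It checks which
encodings can represent each name, and which can represent both.

```python
# Hypothetical file names, invented for illustration only.
FRENCH = "\u00c9dith Piaf - Hymne \u00e0 l'amour.mp3"   # contains E-acute, a-grave
HEBREW = "\u05e9\u05dc\u05d5\u05dd.mp3"                 # "shalom" in Hebrew letters

# A few 8-bit codepages plus UTF-8 (Python's codec names for them).
CANDIDATES = ["cp437", "cp850", "latin-1", "cp1252", "cp1255", "utf-8"]


def encodable(name, codec):
    """Return True if every character of name can be encoded in codec."""
    try:
        name.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


def codecs_covering_all(names, codecs):
    """Return the codecs that can represent every name in names."""
    return [c for c in codecs if all(encodable(n, c) for n in names)]


if __name__ == "__main__":
    # ascii() keeps the output printable on any console codepage.
    for name in (FRENCH, HEBREW):
        print(ascii(name), [c for c in CANDIDATES if encodable(name, c)])
    print("both:", codecs_covering_all([FRENCH, HEBREW], CANDIDATES))
```

    Each name encodes under some 8-bit codepage on its own, but of the
candidates listed only UTF-8 covers the two together, which is the
situation a tool that sees file names through one codepage runs into.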

    That's the main reason proper UTF-8 support is a "want" of mine.
The same files work on Linux, where they are stored on a server, and
Windows reads them fine, but Cygwin is limited to Win98-level support.  :-(

    Aren't most of the libraries used on cygwin the same as those used
on linux?  If UTF-8 support has been added there, I'm not sure why it
is so difficult on cygwin.  Is it a limitation of the underlying OS
calls that would have to be worked around? 

    Oh well...
-linda



