Mail Archives: cygwin/2006/12/02/18:11:53

X-Spam-Check-By: sourceware.org
Message-ID: <45720808.5040409@tlinx.org>
Date: Sat, 02 Dec 2006 15:11:04 -0800
From: Linda Walsh <cygwin AT tlinx DOT org>
User-Agent: Thunderbird 1.5.0.8 (Windows/20061025)
MIME-Version: 1.0
To: cygwin AT cygwin DOT com
Subject: Re: Windows NTFS UCS2 characters
References: <C1946630.19DC4%eljay AT adobe DOT com> <456F0E89 DOT 28B2E427 AT dessent DOT net> <Pine DOT GSO DOT 4 DOT 63 DOT 0611301226090 DOT 10187 AT access1 DOT cims DOT nyu DOT edu>
In-Reply-To: <Pine.GSO.4.63.0611301226090.10187@access1.cims.nyu.edu>
X-IsSubscribed: yes
Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm
List-Id: <cygwin.cygwin.com>
List-Subscribe: <mailto:cygwin-subscribe AT cygwin DOT com>
List-Archive: <http://sourceware.org/ml/cygwin/>
List-Post: <mailto:cygwin AT cygwin DOT com>
List-Help: <mailto:cygwin-help AT cygwin DOT com>, <http://sourceware.org/ml/#faqs>
Sender: cygwin-owner AT cygwin DOT com
Mail-Followup-To: cygwin AT cygwin DOT com
Delivered-To: mailing list cygwin AT cygwin DOT com

Igor Peshansky wrote:
> The former is true, the latter is half-true. Cygwin works with the
> default codepage when the Windows locale settings are set correctly.  You
> cannot *switch* locales programmatically from within Cygwin, but it can
> handle the full 8-bit charset just fine.
>
> Not sure what ANSI means in this context (if you meant ASCII, or 7-bit,
> then the codepage reference makes no sense).  If the codepage is set
> correctly, Cygwin will read those files.
>   
---
    I wish the problem were that simple.  But files created in Windows
aren't all created under any _one_ codepage.  Most of my files are fine
to read under cp850/437 (or an iso8859-1 equivalent), but not all of them.

    A few files, in a most annoying spot, use characters not
supported in the Western/Latin-1 charset.  They are in a Music folder
containing world music.  I'd like to be able to use "rsync" to
copy the music to my MP3 device, but two different codepages
would be required: some files have French names that encode
under an iso8859-1 equivalent codepage, but music in an adjacent
directory is Middle Eastern.  That requires a different codepage,
a Turkish one.

    So you see, there is no [single] codepage that will work to
copy (or read) the files in Cygwin.
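
    To make that concrete, here is a rough Python sketch that checks
which codepages can encode a whole set of names.  The two file names are
made-up stand-ins (not the real ones from my Music folder), and the
helpers encodable() and codecs_covering() are hypothetical names used
only for illustration.  The candidates are the codepages mentioned above
plus UTF-8.

```python
# Hypothetical sample names; NOT taken from the actual Music folder.
FRENCH_NAME = "Café de l'été.mp3"
TURKISH_NAME = "Teşekkür.mp3"

# The 8-bit codepages mentioned above, plus UTF-8 for comparison.
CANDIDATES = ["cp437", "cp850", "latin-1", "utf-8"]


def encodable(name: str, codec: str) -> bool:
    """Return True if every character of `name` exists in `codec`."""
    try:
        name.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


def codecs_covering(names, candidates):
    """Return the candidate codecs able to encode every name in `names`."""
    return [c for c in candidates if all(encodable(n, c) for n in names)]


# Only UTF-8 can represent both sample names at once.
print(codecs_covering([FRENCH_NAME, TURKISH_NAME], CANDIDATES))  # prints ['utf-8']
```

    The candidate list is limited to the codepages named above; other
codepages can be added to CANDIDATES to check them the same way.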

    That's the main reason proper UTF-8 support is a "want" of mine.
It works on Linux, where the files are stored on a server, and Windows
reads them, but Cygwin is limited to Win98-level support.  :-(

    Aren't most of the libraries used on Cygwin the same as those used
on Linux?  If UTF-8 support has been added there, I'm not sure why it
is so difficult on Cygwin.  Is it a limitation of the underlying OS
calls that would have to be worked around?

    Oh well...
-linda


