Message-ID: <45720808.5040409@tlinx.org>
Date: Sat, 02 Dec 2006 15:11:04 -0800
From: Linda Walsh
To: cygwin AT cygwin DOT com
Subject: Re: Windows NTFS UCS2 characters
References: <456F0E89 DOT 28B2E427 AT dessent DOT net>

Igor Peshansky wrote:
> The former is true, the latter is half-true.  Cygwin works with the
> default codepage when the Windows locale settings are set correctly.  You
> cannot *switch* locales programmatically from within Cygwin, but it can
> handle the full 8-bit charset just fine.
>
> Not sure what ANSI means in this context (if you meant ASCII, or 7-bit,
> then the codepage reference makes no sense).  If the codepage is set
> correctly, Cygwin will read those files.
---
I wish the problem were that simple.  But files created in Windows
aren't created under any _one_ codepage.  Most of my files read fine
under cp850/437 (or an iso8859-1 equivalent), but not all of them.
A few files, in a most annoying section, use characters not supported
in the western/latin-1 charset.

They're in a Music folder containing world music.  I'd like to be able
to use "rsync" to copy the music to my MP3 device, but two different
codepages would be required: some files have French names that encode
under an iso8859-1 equivalent codepage, but music in an adjacent
directory is Middle Eastern.  That requires some different, Turkish
codepage.

So you see, there is no [single] codepage that will work to copy
(or read) the files in Cygwin.  That's the main reason proper UTF-8
support is a "want" of mine.
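
The "no single codepage" point can be checked with a small sketch. It assumes Python 3, and everything in it is illustrative rather than taken from the actual Music folder: the three file names are made up, the Arabic-script name is only a guess at what "Middle Eastern" might include, and the codepage list and the helpers `can_encode` and `coverage` are hypothetical names chosen for this example.

```python
# Hypothetical file names (assumptions, not the real folder contents).
NAMES = [
    "Édith Piaf - Hymne à l'amour.mp3",      # French
    "Barış Manço - Dağlar Dağlar.mp3",       # Turkish
    "\u0641\u064a\u0631\u0648\u0632.mp3",    # Arabic script, written as escapes
]

# A few 8-bit codepages to try: Western, Turkish, and Arabic (assumed list).
CODEPAGES = ["cp437", "cp850", "latin-1", "cp857", "cp1254", "cp1256"]


def can_encode(name, codec):
    """Return True if every character of name exists in codec."""
    try:
        name.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


def coverage(names, codecs):
    """Map each codec to the list of names it cannot represent."""
    return {c: [n for n in names if not can_encode(n, c)] for c in codecs}


if __name__ == "__main__":
    report = coverage(NAMES, CODEPAGES + ["utf-8"])
    for codec, failed in report.items():
        print(f"{codec:8} cannot encode {len(failed)} of {len(NAMES)} names")
```

Each 8-bit codepage in the list leaves at least one of these names unrepresentable, while UTF-8 encodes all three. Which pairs of names happen to share a codepage depends on the exact characters involved, so the sketch reports per-codepage failures instead of assuming them.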
Proper UTF-8 support works on Linux, where the files are stored on a
server and Windows reads them, but Cygwin is limited to Win98-level
support. :-(

Aren't most of the libraries used on Cygwin the same as those used
on Linux?  If UTF-8 support has been added there, I'm not sure why
it is so difficult on Cygwin.  Is it a limitation of the underlying
OS calls that would have to be worked around?

Oh well...
-linda