Mailing-List: contact cygwin-developers-help AT cygwin DOT com; run by ezmlm
Sender: cygwin-developers-owner AT cygwin DOT com
Delivered-To: mailing list cygwin-developers AT cygwin DOT com
X-WM-Posted-At: avacado.atomice.net; Wed, 3 Jul 02 12:30:04 +0100
Message-ID: <00ca01c22284$f9ce27a0$0100a8c0@advent02>
From: "Chris January"
References: <200207031122 DOT g63BM3lL015113 AT burner DOT fokus DOT gmd DOT de>
Subject: Re: UTF8 support in Cygwin
Date: Wed, 3 Jul 2002 12:30:03 +0100
MIME-Version: 1.0
Content-Type: text/plain; charset="Windows-1252"
Content-Transfer-Encoding: 7bit
X-Priority: 3
X-MSMail-Priority: Normal
X-MIMEOLE: Produced By Microsoft MimeOLE V6.00.2600.0000

> >I am working on a patch which would add UTF8 support to Cygwin.
> >i.e. Unicode filenames would be encoded as UTF8 before being returned by,
> >e.g., readdir and then converted back to Unicode before being passed to the
> >Windows API.
> >This would solve Ville Herva's problem where he/she wanted to back up a
> >filesystem containing Unicode filenames using Cygwin, but found that the
> >Unicode characters were converted to question marks. Also, with an
> >appropriate terminal, it is actually possible to view the Unicode characters
> >(although at the moment, it is not possible to input them correctly AFAIK).
> >The code is currently guarded by a CYGWIN environment variable flag, 'utf8'.
>
> A long awaited feature!
>
> This would really help for "star" and "mkisofs".
>
> Star needs to archive UTF-8 coded names in the POSIX.1-2001 filenames
> and "mkisofs" needs to deal with UNICODE names in Joliet and UDF.
>
> How about using the LC_* locale setup to force UTF-8 coding?

The utf8 flag turns on conversion of Windows filenames from Unicode to
UTF-8 and back again. This is completely unrelated to the LC_* stuff.
You could use UTF-8 filenames without the utf8 flag: they would just be
stored as UTF-8 on the Windows filesystem, and your application would see
exactly the same filename. The only reason to use the flag is to access
existing files which have Unicode filenames under Windows NT.

For the record, I've now changed all the MultiByteToWideChar etc. calls to
sys_utf8towcs, to make the code more consistent with the other conversion
functions.

Chris
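
To illustrate the Unicode to UTF-8 round trip discussed above, here is a
minimal, portable sketch in C. It is not the patch itself and not the
implementation of sys_utf8towcs: the function names utf16_to_utf8 and
utf8_to_utf16 are hypothetical, and the code assumes that Windows NT
filenames are sequences of 16-bit UTF-16 code units. It does no strict
validation (overlong UTF-8 forms are not rejected).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode n UTF-16 code units as UTF-8 into out
   (capacity cap bytes). Returns the number of bytes written, or
   (size_t)-1 on an unpaired surrogate or insufficient space. */
size_t utf16_to_utf8(const uint16_t *in, size_t n,
                     unsigned char *out, size_t cap)
{
    size_t o = 0;
    for (size_t i = 0; i < n; i++) {
        uint32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {        /* high surrogate */
            if (i + 1 >= n || in[i + 1] < 0xDC00 || in[i + 1] > 0xDFFF)
                return (size_t)-1;
            cp = 0x10000 + ((cp - 0xD800) << 10) + (in[++i] - 0xDC00);
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            return (size_t)-1;                     /* stray low surrogate */
        }
        size_t need = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
        if (o + need > cap)
            return (size_t)-1;
        if (need == 1) {
            out[o++] = (unsigned char)cp;
        } else if (need == 2) {
            out[o++] = (unsigned char)(0xC0 | (cp >> 6));
            out[o++] = (unsigned char)(0x80 | (cp & 0x3F));
        } else if (need == 3) {
            out[o++] = (unsigned char)(0xE0 | (cp >> 12));
            out[o++] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
            out[o++] = (unsigned char)(0x80 | (cp & 0x3F));
        } else {
            out[o++] = (unsigned char)(0xF0 | (cp >> 18));
            out[o++] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
            out[o++] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
            out[o++] = (unsigned char)(0x80 | (cp & 0x3F));
        }
    }
    return o;
}

/* Hypothetical helper: decode n bytes of UTF-8 into UTF-16 code units
   (capacity cap units). Returns the number of units written, or
   (size_t)-1 on malformed input or insufficient space. */
size_t utf8_to_utf16(const unsigned char *in, size_t n,
                     uint16_t *out, size_t cap)
{
    size_t o = 0, i = 0;
    while (i < n) {
        unsigned char b = in[i++];
        uint32_t cp;
        int extra;                                 /* continuation bytes */
        if (b < 0x80)                { cp = b;        extra = 0; }
        else if ((b & 0xE0) == 0xC0) { cp = b & 0x1F; extra = 1; }
        else if ((b & 0xF0) == 0xE0) { cp = b & 0x0F; extra = 2; }
        else if ((b & 0xF8) == 0xF0) { cp = b & 0x07; extra = 3; }
        else return (size_t)-1;
        for (; extra > 0; extra--) {
            if (i >= n || (in[i] & 0xC0) != 0x80)
                return (size_t)-1;
            cp = (cp << 6) | (uint32_t)(in[i++] & 0x3F);
        }
        if (cp >= 0x10000) {                       /* needs a surrogate pair */
            if (cp > 0x10FFFF || o + 2 > cap)
                return (size_t)-1;
            cp -= 0x10000;
            out[o++] = (uint16_t)(0xD800 | (cp >> 10));
            out[o++] = (uint16_t)(0xDC00 | (cp & 0x3FF));
        } else {
            if (o + 1 > cap)
                return (size_t)-1;
            out[o++] = (uint16_t)cp;
        }
    }
    return o;
}
```

In this sketch the first function corresponds to the direction used when a
name is returned by something like readdir, and the second to the direction
used before a name is handed to the Windows API. Both take explicit lengths
and capacities rather than relying on NUL termination, which is a choice
made for the example only.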