Mail Archives: cygwin/2009/05/12/16:09:20

X-Recipient: archive-cygwin AT delorie DOT com
X-Spam-Check-By: sourceware.org
Date: Tue, 12 May 2009 22:08:53 +0200
From: Corinna Vinschen <corinna-cygwin AT cygwin DOT com>
To: cygwin AT cygwin DOT com
Subject: Re: [1.7] Proposal: the filename encoding in C locale uses UTF-8 instead of SO/UTF-8
Message-ID: <20090512200853.GA20162@calimero.vinschen.de>
Reply-To: cygwin AT cygwin DOT com
Mail-Followup-To: cygwin AT cygwin DOT com
References: <3f0ad08d0905121029j119c8a7ep41d3a261d8bea338 AT mail DOT gmail DOT com> <20090512173741 DOT GZ21324 AT calimero DOT vinschen DOT de> <f60fe000905121213p25f89b71v50931fce588f38a AT mail DOT gmail DOT com> <20090512192253 DOT GB21324 AT calimero DOT vinschen DOT de> <f60fe000905121253j64347964p43e0644e798a8f29 AT mail DOT gmail DOT com>
MIME-Version: 1.0
In-Reply-To: <f60fe000905121253j64347964p43e0644e798a8f29@mail.gmail.com>
User-Agent: Mutt/1.5.19 (2009-02-20)
Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm
List-Id: <cygwin.cygwin.com>
List-Unsubscribe: <mailto:cygwin-unsubscribe-archive-cygwin=delorie DOT com AT cygwin DOT com>
List-Subscribe: <mailto:cygwin-subscribe AT cygwin DOT com>
List-Archive: <http://sourceware.org/ml/cygwin/>
List-Post: <mailto:cygwin AT cygwin DOT com>
List-Help: <mailto:cygwin-help AT cygwin DOT com>, <http://sourceware.org/ml/#faqs>
Sender: cygwin-owner AT cygwin DOT com
Delivered-To: mailing list cygwin AT cygwin DOT com

On May 12 15:53, Mark J. Reed wrote:
> On Tue, May 12, 2009 at 3:22 PM, Corinna Vinschen
> >
> > http://cygwin.com/1.7/cygwin-ug-net/using-specialnames.html#pathnames-unusual
> 
> OK, got it.  So Mr. Iwamuro's proposal is that Cygwin ignore the
> locale setting, and just automatically convert the Windows UTF-16
> filenames to UTF-8 (and back) no matter what.

No.  Only if LANG=C.

> That seems rife with possible confusion, though. If I have my codepage
> set to ISO-2022 and paste in a filename, I expect it to be interpreted

Cygwin 1.7 doesn't use the codepage.  It uses what $LANG says.  See
http://cygwin.com/1.7/cygwin-ug-net/setup-locale.html

> as ISO-2022, not as UTF-8 (which will probably fail with an invalid
> encoding sequence).
> 
> OTOH, the SO/UTF-8 hack would seem to bode ill for the portability of,
> say, tar archives created under Cygwin.

The filenames potentially look weird, but they are valid filenames.
If anybody has a better idea how to work around the problem of UTF-16
chars which don't translate into the current singlebyte or multibyte
charset, feel free to suggest it.
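
To make the SO/UTF-8 workaround concrete, here is a minimal sketch in
Python, not Cygwin's actual code.  It assumes the scheme described in the
user's guide section linked above: a character which has no representation
in the current charset is written as ASCII SO (0x0e) followed by the UTF-8
bytes of that character.  The function name encode_filename and the
charset parameter are hypothetical, chosen only for this illustration.

```python
SO = b"\x0e"  # ASCII Shift Out (Ctrl-N), assumed marker byte


def encode_filename(name: str, charset: str = "ascii") -> bytes:
    """Encode name in charset, escaping unrepresentable characters.

    Hypothetical helper: each character that charset cannot encode is
    emitted as SO followed by that character's UTF-8 bytes.
    """
    out = bytearray()
    for ch in name:
        try:
            out += ch.encode(charset)
        except UnicodeEncodeError:
            # No representation in the current charset: fall back to
            # the SO marker plus the UTF-8 form of this one character.
            out += SO + ch.encode("utf-8")
    return bytes(out)


if __name__ == "__main__":
    # With an ASCII-only charset the accented letter gets escaped.
    print(encode_filename("caf\u00e9.txt"))
    # With a charset that contains the letter, no escape is needed.
    print(encode_filename("caf\u00e9.txt", "latin-1"))
```

The result is always a valid byte string for a filename, which is the
point made above, but a tool that doesn't know the convention will show
the SO byte and the UTF-8 bytes literally.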


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat

--
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
Problem reports:       http://cygwin.com/problems.html
Documentation:         http://cygwin.com/docs.html
FAQ:                   http://cygwin.com/faq/
