Mail Archives: cygwin/2009/05/12/09:54:46

Date: Tue, 12 May 2009 15:54:24 +0200
From: Corinna Vinschen <corinna-cygwin AT cygwin DOT com>
To: cygwin AT cygwin DOT com
Subject: Re: Cygwin programs doesn't support non-ASCII filenames
Message-ID: <20090512135424.GT21324@calimero.vinschen.de>
Reply-To: cygwin AT cygwin DOT com
Mail-Followup-To: cygwin AT cygwin DOT com
References: <gu2u4o$f2i$3 AT ger DOT gmane DOT org> <20090509100231 DOT GR21324 AT calimero DOT vinschen DOT de> <gu46gf$3tf$1 AT ger DOT gmane DOT org>
MIME-Version: 1.0
In-Reply-To: <gu46gf$3tf$1@ger.gmane.org>
User-Agent: Mutt/1.5.19 (2009-02-20)
Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm
List-Id: <cygwin.cygwin.com>
List-Unsubscribe: <mailto:cygwin-unsubscribe-archive-cygwin=delorie DOT com AT cygwin DOT com>
List-Subscribe: <mailto:cygwin-subscribe AT cygwin DOT com>
List-Archive: <http://sourceware.org/ml/cygwin/>
List-Post: <mailto:cygwin AT cygwin DOT com>
List-Help: <mailto:cygwin-help AT cygwin DOT com>, <http://sourceware.org/ml/#faqs>
Sender: cygwin-owner AT cygwin DOT com
Delivered-To: mailing list cygwin AT cygwin DOT com

On May  9 23:12, Lenik wrote:
> (This mail is encoded in utf-8)
>[...]
> The two chinese characters encoding in:
> GB2312: d7 c0 c3 e6
> UTF-8: e6 a1 8c e9 9d a2
> Unicode: \u684c \u9762
>[...]
> This is a new test don't use cygpath:
>     C:\Profiles\Shecti> set LANG=& bash -c "cat ??????"
>     cat: ??????: No such file or directory

I'm just looking into this issue and I do not quite understand how you
came up with the filename in this example.  Above you mention that the
mail is in UTF-8.  However, when I look into this email using `od -t
x1', the multibyte sequence in your example is e4 bd a0 e5 a5 bd, rather
than the aforementioned UTF-8 sequence e6 a1 8c e9 9d a2.  Nor does it
match the aforementioned GB2312 sequence d7 c0 c3 e6.  Can you please
explain how the multibyte sequence in the example is related to the
above GB2312 and UTF-8 sequences?
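
The mismatch can be reproduced with a short Python sketch.  It assumes
the hex strings quoted in this thread are exact; the variable and helper
names are illustrative only and do not come from any Cygwin tool.

```python
# Byte sequences as quoted in the thread (assumed to be exact).
claimed_gb2312 = bytes.fromhex("d7 c0 c3 e6")
claimed_utf8 = bytes.fromhex("e6 a1 8c e9 9d a2")
observed = bytes.fromhex("e4 bd a0 e5 a5 bd")  # bytes seen with od -t x1


def code_points(text):
    """Return the Unicode code points of text as U+XXXX strings."""
    return [f"U+{ord(ch):04X}" for ch in text]


# Decode each sequence with the encoding it is claimed to be in.
claimed_from_gb = claimed_gb2312.decode("gb2312")
claimed_from_utf8 = claimed_utf8.decode("utf-8")
# The observed bytes are decoded as UTF-8, since the mail says it is UTF-8.
observed_text = observed.decode("utf-8")

print("GB2312 bytes   ->", code_points(claimed_from_gb))
print("UTF-8 bytes    ->", code_points(claimed_from_utf8))
print("observed bytes ->", code_points(observed_text))
print("observed matches claimed:", observed_text == claimed_from_utf8)
```

If the two claimed sequences decode to the same code points while the
observed one does not, the filename in the example is a different string
from the one described, not a differently encoded copy of it.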


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat

