Mail Archives: cygwin/2009/05/12/11:10:19

To: cygwin AT cygwin DOT com
From: Lenik <lenik AT bodz DOT net>
Subject: Re: Cygwin programs doesn't support non-ASCII filenames
Date: Tue, 12 May 2009 23:07:47 +0800
Lines: 33
Message-ID: <guc3cc$8kf$1@ger.gmane.org>
References: <gu2u4o$f2i$3 AT ger DOT gmane DOT org> <20090509100231 DOT GR21324 AT calimero DOT vinschen DOT de> <gu46gf$3tf$1 AT ger DOT gmane DOT org> <20090512135424 DOT GT21324 AT calimero DOT vinschen DOT de>
Mime-Version: 1.0
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1b3pre) Gecko/20090223 Thunderbird/3.0b2
In-Reply-To: <20090512135424.GT21324@calimero.vinschen.de>

On 2009-5-12 21:54, Corinna Vinschen wrote:
> On May  9 23:12, Lenik wrote:
>> (This mail is encoded in utf-8)
>> [...]
>> The two chinese characters encoding in:
>> GB2312: d7 c0 c3 e6
>> UTF-8: e6 a1 8c e9 9d a2
>> Unicode: \u684c \u9762
>> [...]
>> This is a new test don't use cygpath:
>>      C:\Profiles\Shecti>  set LANG=&  bash -c "cat ??????"
>>      cat: ??????: No such file or directory
>
> I'm just looking into this issue and I do not quite understand how you
> came up with the filename in this example.  Above you mention that the
> mail is in UTF-8.  However, when I look into this email using `od -t
> x1', the multibyte sequence in your example is e4 bd a0 e5 a5 bd, rather
> than the aforementioned UTF-8 sequence e6 a1 8c e9 9d a2.  Nor does it
> match the aforementioned GB2312 sequence d7 c0 c3 e6.  Can you please
> explain how the multibyte sequence in the example is related to the
> above GB2312 and UTF-8 sequences?
>
>
> Corinna
>
Sorry for the confusion. There are two examples: the first uses 桌面
and the second uses 你好. You can test with either one.

桌面: UTF-8 = e6 a1 8c e9 9d a2, GB2312 = d7 c0 c3 e6
你好: UTF-8 = e4 bd a0 e5 a5 bd, GB2312 = c4 e3 ba c3
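
For anyone who wants to double-check the byte sequences above, here is a minimal Python 3 sketch that recomputes them. It is not part of the original test and does not touch Cygwin at all; the helper name hex_bytes is made up for illustration, and the strings are written as \u escapes (taken from the code points quoted earlier in the thread) so the script stays pure ASCII.

```python
# Sketch only: recompute the UTF-8 and GB2312 bytes quoted in this thread.
# hex_bytes is a hypothetical helper, not something from Cygwin or the thread.

def hex_bytes(text, encoding):
    """Return the encoded bytes of text as space-separated lowercase hex."""
    return " ".join(f"{b:02x}" for b in text.encode(encoding))

# The two test strings, written as escapes so the source file is pure ASCII.
samples = [
    "\u684c\u9762",  # first example
    "\u4f60\u597d",  # second example
]

for text in samples:
    # ascii() prints the escapes instead of the characters, so the output
    # does not depend on the console code page.
    print(ascii(text),
          "UTF-8 =", hex_bytes(text, "utf-8"),
          "GB2312 =", hex_bytes(text, "gb2312"))
```

Comparing this output with what `od -t x1' shows for a mail or a file name should make it clear which of the two strings, and which encoding, is actually present.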

Thanks,
Lenik


