Date: Tue, 12 May 2009 15:54:24 +0200
From: Corinna Vinschen
To: cygwin AT cygwin DOT com
Subject: Re: Cygwin programs doesn't support non-ASCII filenames
Message-ID: <20090512135424.GT21324@calimero.vinschen.de>
References: <20090509100231 DOT GR21324 AT calimero DOT vinschen DOT de>

On May  9 23:12, Lenik wrote:
> (This mail is encoded in utf-8)
>[...]
> The two chinese characters encoding in:
>     GB2312:  d7 c0 c3 e6
>     UTF-8:   e6 a1 8c e9 9d a2
>     Unicode: \u684c \u9762
>[...]
> This is a new test don't use cygpath:
>     C:\Profiles\Shecti> set LANG=& bash -c "cat ??????"
>     cat: ??????: No such file or directory

I'm just looking into this issue and I do not quite understand how you
came up with the filename in this example.  Above you mention that the
mail is in UTF-8.  However, when I look into this email using
`od -t x1', the multibyte sequence in your example is

  e4 bd a0 e5 a5 bd

rather than the aforementioned UTF-8 sequence

  e6 a1 8c e9 9d a2

Nor does it match the aforementioned GB2312 sequence

  d7 c0 c3 e6

Can you please explain how the multibyte sequence in the example is
related to the above GB2312 and UTF-8 sequences?
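
For anyone who wants to reproduce the comparison, here is a minimal
sketch in Python.  It is not what was run for this mail (that was
`od -t x1' on the raw message); the helper name hex_bytes is made up
for illustration, and the script assumes the code points and byte
values are exactly as quoted above.

```python
# Sketch: compare the byte sequences quoted in this thread.
# hex_bytes is a hypothetical helper, not part of any Cygwin tool.

def hex_bytes(data: bytes) -> str:
    """Format bytes as space-separated lowercase hex, similar to od -t x1."""
    return " ".join(f"{b:02x}" for b in data)

# The two characters as given in the report (Unicode: \u684c \u9762).
reported = "\u684c\u9762"

# The six bytes actually found in the example command of the mail.
found = bytes.fromhex("e4bda0e5a5bd")

# Encode the reported characters in both encodings named in the report.
reported_utf8 = hex_bytes(reported.encode("utf-8"))
reported_gb2312 = hex_bytes(reported.encode("gb2312"))

# Interpret the found bytes as UTF-8 and list the resulting code points.
found_as_text = found.decode("utf-8")
found_code_points = [f"U+{ord(c):04X}" for c in found_as_text]

print("reported, UTF-8  :", reported_utf8)
print("reported, GB2312 :", reported_gb2312)
print("found, as UTF-8  :", found_code_points)
print("same characters  :", found_as_text == reported)
```

The found bytes are valid UTF-8, but they decode to U+4F60 U+597D,
which are not the two characters named in the report.  So the example
seems to use a different filename, not a different encoding of the
same one.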

Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat