X-Recipient: | archive-cygwin AT delorie DOT com |
X-Spam-Check-By: | sourceware.org |
Date: | Sat, 20 Feb 2010 10:17:10 +0100 |
From: | Corinna Vinschen <corinna-cygwin AT cygwin DOT com> |
To: | cygwin AT cygwin DOT com |
Subject: | Re: 1.7.1: unable to run the a bash script resides in chinese path using: c:\cygwin\bin\bash --login script. |
Message-ID: | <20100220091710.GI5683@calimero.vinschen.de> |
Reply-To: | cygwin AT cygwin DOT com |
Mail-Followup-To: | cygwin AT cygwin DOT com |
References: | <t94sn59ntooeal9hc0a25hkk7ntphg99cf AT 4ax DOT com> <c6fsn5ln6bdtgr86bp3ri44ui48kf57ica AT 4ax DOT com> <416096c61002191229x670cbb63gf5c693056af727a2 AT mail DOT gmail DOT com> <drmun5969k15jlm1ji2auh5cojrnakc6uu AT 4ax DOT com> <416096c61002200000r549264c4tfdf46a9b71700bc AT mail DOT gmail DOT com> |
MIME-Version: | 1.0 |
In-Reply-To: | <416096c61002200000r549264c4tfdf46a9b71700bc@mail.gmail.com> |
User-Agent: | Mutt/1.5.20 (2009-06-14) |
Mailing-List: | contact cygwin-help AT cygwin DOT com; run by ezmlm |
List-Id: | <cygwin.cygwin.com> |
List-Unsubscribe: | <mailto:cygwin-unsubscribe-archive-cygwin=delorie DOT com AT cygwin DOT com> |
List-Subscribe: | <mailto:cygwin-subscribe AT cygwin DOT com> |
List-Archive: | <http://sourceware.org/ml/cygwin/> |
List-Post: | <mailto:cygwin AT cygwin DOT com> |
List-Help: | <mailto:cygwin-help AT cygwin DOT com>, <http://sourceware.org/ml/#faqs> |
Sender: | cygwin-owner AT cygwin DOT com |
Delivered-To: | mailing list cygwin AT cygwin DOT com |
On Feb 20 08:00, Andy Koppe wrote:
> Hongyi Zhao:
> >>Looks like there's some sort of GBK vs UTF-8 mixup going on, because
> >>'??????????????????' is the same byte sequence in GBK as '????????????' is in UTF-8:
> >>\xE6\x96\xB0\xE6\x9F\xA5\xE6\x96\x87\xE7\x8C\xAE
> >
> > Could you please give me some hints on the tools
> > used by you to obtain this conclusion?
>
> That was just a hunch based on the length of the two strings, and I
> confirmed it by pasting the strings into mintty running a utility for
> echoing keycodes, switching charset as appropriate.
>
> Anyway, I had a look into why the dosfilewarning prints the wrong
> filename: it calls small_sprintf to print the message, and
> small_sprintf uses the ANSI version of WriteFile to write to
> STD_ERROR_HANDLE, so it ends up interpreting a UTF-8 string as GBK.
> Seems sys_mbstowcs and WriteFileW are needed there.

There's no such thing as a WriteFileW function.

Since that only affects a few error messages, I don't think it's overly
important.  The simplest approach here is to print all characters > 0x7f
as hex values, as in:

  MS-DOS style path detected: \tmp\t\\xC3\xB6\xC3\xA4
  Preferred POSIX equivalent is: /tmp/t/\xC3\xB6\xC3\xA4
  [...]

I've changed that in CVS.


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat

--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
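
For readers who want to reproduce the two points made in this message, here is a minimal sketch in Python. It is an illustration only and not the Cygwin code, which is C/C++ and is not shown in the thread. The helper name hex_escape is hypothetical. The byte sequence is the one quoted above, and the sample path is chosen so that its UTF-8 bytes match the \xC3\xB6\xC3\xA4 shown in the example output.

```python
# Byte sequence quoted in the thread.
RAW = b"\xE6\x96\xB0\xE6\x9F\xA5\xE6\x96\x87\xE7\x8C\xAE"


def hex_escape(data: bytes) -> str:
    """Render every byte > 0x7f as \\xNN and pass ASCII bytes through.

    Hypothetical helper that mimics the output style described in the
    message; it is not the actual small_sprintf change.
    """
    return "".join(chr(b) if b <= 0x7F else "\\x%02X" % b for b in data)


# The same 12 bytes decode to strings of different lengths depending on
# the charset: 3 bytes per character in UTF-8, 2 bytes per character in GBK.
as_utf8 = RAW.decode("utf-8")
as_gbk = RAW.decode("gbk")
print(len(as_utf8), len(as_gbk))

# Escaping the non-ASCII bytes makes the output independent of the
# console charset.
print(hex_escape("/tmp/t/\u00f6\u00e4".encode("utf-8")))
```

Because the escaped form contains only ASCII, it reads the same whether the console interprets bytes as GBK, UTF-8, or anything else, which is the property the hex-printing approach relies on.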