Mail Archives: cygwin/2010/02/19/23:01:20
On Fri, 19 Feb 2010 20:29:27 +0000, Andy Koppe <andy DOT koppe AT gmail DOT com>
wrote:
>Looks like there's some sort of GBK vs UTF-8 mixup going on, because
>'鏂版煡鏂囩尞' is the same byte sequence in GBK as '新查文献' is in UTF-8:
>\xE6\x96\xB0\xE6\x9F\xA5\xE6\x96\x87\xE7\x8C\xAE
Wonderful analysis! Could you please give me some hints on the tools
you used to reach this conclusion?
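
For reference, the equivalence quoted above can be checked with a few
lines of Python. This is only a sketch, not necessarily the tool that was
used to reach the conclusion; the helper name reinterpret is made up for
illustration, and it assumes Python's standard "gbk" codec matches the
Windows GBK codepage for these characters.

```python
def reinterpret(text: str, written_as: str, read_as: str) -> str:
    """Encode text with one codec, then decode the same bytes with another."""
    return text.encode(written_as).decode(read_as)


real_name = "新查文献"

# The UTF-8 bytes of the real directory name, printed as hex.
raw = real_name.encode("utf-8")
print(" ".join(f"{b:02X}" for b in raw))

# The same bytes misread as GBK give the garbled name.
print(reinterpret(real_name, "utf-8", "gbk"))
```

Running it in a terminal that can display Chinese characters should print
the byte sequence and then the garbled form of the name.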
>
>I take it the actual directory name is '新查文献'? (Babelfish seems to be
>able to make some sense of that one but not the other.)
Yes, you're right. The actual directory name is '新查文献'.
>
>Do you know what the encoding of your batch file is?
GB2312
> And have you got
>any locale variables (LC_ALL, LC_CTYPE, LANG) set when invoking it?
I use the following settings in the same batch file:
set LC_ALL=zh_CN.UTF-8
set LC_CTYPE="zh_CN.UTF-8"
set LANG=zh_CN.UTF-8
>
>
>>>@echo off
>>>C:\cygwin\bin\bash --login "%~dp0myscript"
>>
>> I've found an even stranger thing: if I change the batch file to the
>> following form, it runs without problems:
>>
>> @echo off
>> C:\cygwin\bin\bash --login %~dp0myscript
>>
>> The quotation marks in the former version handle whitespace in
>> myscript's pathname, though that is a relatively rare case. But in
>> the latter version, if there is whitespace in myscript's pathname,
>> the batch file will fail to run.
>
>Hmm, perhaps the argument mangling at program startup is using the
>ANSI codepage (i.e. GBK in this case) when it should be using UTF-8?
But if I convert my batch file to UTF-8 (without BOM, CR/LF line
endings), I get the following error:
/usr/bin/bash:
"F:/zhaohs/Desktop/鏂版煡鏂囩尞/RestoreName4Elsevier.sh": No such
file or directory
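
To double-check which encoding a batch file is actually saved in after
such a conversion, something like the following Python sketch may help.
It is a rough heuristic, not a definitive detector: the function name
guess_encoding is hypothetical, and it only distinguishes the encodings
discussed in this thread.

```python
import codecs


def guess_encoding(data: bytes) -> str:
    """Guess among ASCII, UTF-8 (with or without BOM) and GBK."""
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    # Order matters: UTF-8 bytes of Chinese text often also decode as GBK
    # (that is the mixup seen above), so the stricter UTF-8 check runs first.
    for name in ("ascii", "utf-8", "gbk"):
        try:
            data.decode(name)
            return name
        except UnicodeDecodeError:
            continue
    return "unknown"
```

It would be called on the raw bytes of the file, for example
guess_encoding(open(path, "rb").read()), where path is a placeholder for
the batch file's location.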
--
.: Hongyi Zhao [ hongyi.zhao AT gmail.com ] Free as in Freedom :.
--
Problem reports: http://cygwin.com/problems.html
FAQ: http://cygwin.com/faq/
Documentation: http://cygwin.com/docs.html
Unsubscribe info: http://cygwin.com/ml/#unsubscribe-simple