Mail Archives: cygwin/2010/02/19/23:01:20

X-Recipient: archive-cygwin AT delorie DOT com
X-SWARE-Spam-Status: No, hits=-0.3 required=5.0 tests=AWL,BAYES_00,RCVD_NUMERIC_HELO,SPF_HELO_PASS,SPF_PASS
X-Spam-Check-By: sourceware.org
To: cygwin AT cygwin DOT com
From: Hongyi Zhao <hongyi DOT zhao AT gmail DOT com>
Subject: Re: 1.7.1: unable to run the a bash script resides in chinese path using: c:\cygwin\bin\bash --login script.
Date: Sat, 20 Feb 2010 12:00:37 +0800
Lines: 57
Message-ID: <drmun5969k15jlm1ji2auh5cojrnakc6uu@4ax.com>
References: <t94sn59ntooeal9hc0a25hkk7ntphg99cf AT 4ax DOT com> <c6fsn5ln6bdtgr86bp3ri44ui48kf57ica AT 4ax DOT com> <416096c61002191229x670cbb63gf5c693056af727a2 AT mail DOT gmail DOT com>
Mime-Version: 1.0
X-IsSubscribed: yes
Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm
List-Id: <cygwin.cygwin.com>
List-Subscribe: <mailto:cygwin-subscribe AT cygwin DOT com>
List-Archive: <http://sourceware.org/ml/cygwin/>
List-Post: <mailto:cygwin AT cygwin DOT com>
List-Help: <mailto:cygwin-help AT cygwin DOT com>, <http://sourceware.org/ml/#faqs>
Sender: cygwin-owner AT cygwin DOT com
Mail-Followup-To: cygwin AT cygwin DOT com
Delivered-To: mailing list cygwin AT cygwin DOT com
Note-from-DJ: This may be spam

On Fri, 19 Feb 2010 20:29:27 +0000, Andy Koppe <andy DOT koppe AT gmail DOT com>
wrote:

>Looks like there's some sort of GBK vs UTF-8 mixup going on, because
>'鏂版煡鏂囩尞' is the same byte sequence in GBK as '新查文献' is in UTF-8:
>\xE6\x96\xB0\xE6\x9F\xA5\xE6\x96\x87\xE7\x8C\xAE

Wonderful analysis!  Could you tell me which tools you used to reach
this conclusion?
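
One way to check this kind of mixup is sketched below. It assumes
Python 3 and its built-in "utf-8" and "gbk" codecs; it is not necessarily
the tool used for the analysis quoted above, and the helper names
misread_utf8_as_gbk and hex_escape are made up for illustration.

```python
# Sketch: reinterpret the UTF-8 bytes of a name as GBK, as in the suspected mixup.
ORIGINAL = "新查文献"      # actual directory name, as confirmed in this thread
GARBLED = "鏂版煡鏂囩尞"   # name as it appears in the garbled form


def misread_utf8_as_gbk(text: str) -> str:
    """Encode text as UTF-8, then decode those same bytes as if they were GBK."""
    return text.encode("utf-8").decode("gbk")


def hex_escape(data: bytes) -> str:
    """Render bytes as backslash-x hex pairs, the notation used in the quoted text."""
    return "".join(f"\\x{b:02X}" for b in data)


if __name__ == "__main__":
    # Byte sequence of the real name in UTF-8.
    print(hex_escape(ORIGINAL.encode("utf-8")))
    # Byte sequence of the garbled name in GBK; compare it with the line above.
    print(hex_escape(GARBLED.encode("gbk")))
    # Whether misreading the real name reproduces the garbled one.
    print(misread_utf8_as_gbk(ORIGINAL) == GARBLED)
```

If the two hex lines match, the garbled name is just the UTF-8 bytes of
the real name displayed through a GBK decoder.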

>
>I take it the actual directory name is '新查文献'? (Babelfish seems to be
>able to make some sense of that one but not the other.)

Yes, you're right.  The actual directory name is '新查文献'.

>
>Do you know what the encoding of your batch file is?

GB2312

> And have you got
>any locale variables (LC_ALL, LC_CTYPE, LANG) set when invoking it?

I use the following settings in the same batch file:

set LC_ALL=zh_CN.UTF-8
set LC_CTYPE="zh_CN.UTF-8"
set LANG=zh_CN.UTF-8
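
A side note on the settings above: in cmd.exe, set NAME="value" stores
the quotation marks as part of the value, so LC_CTYPE here would contain
literal quotes. Since LC_ALL takes precedence over LC_CTYPE and LANG,
this may well be harmless in this case. The sketch below, in Python 3
with a hypothetical helper named inspect_locale_value, shows one way to
inspect what the child process actually receives.

```python
import os


def inspect_locale_value(value: str) -> dict:
    """Split a locale string such as zh_CN.UTF-8 and flag stray double quotes."""
    stray_quotes = value.startswith('"') or value.endswith('"')
    cleaned = value.strip('"')
    # A POSIX-style locale name has the shape language_TERRITORY.codeset.
    name, dot, charset = cleaned.partition(".")
    return {
        "locale": name,
        "charset": charset if dot else None,
        "stray_quotes": stray_quotes,
    }


if __name__ == "__main__":
    # LC_ALL overrides LC_CTYPE, which in turn overrides LANG.
    for var in ("LC_ALL", "LC_CTYPE", "LANG"):
        raw = os.environ.get(var)
        if raw is None:
            print(var, "is not set")
        else:
            print(var, repr(raw), inspect_locale_value(raw))
```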

>
>
>>>@echo off
>>>C:\cygwin\bin\bash --login "%~dp0myscript"
>>
>> I've found a more strange thing: If I change the batch file into the
>> following form, then it will be run smoothly:
>>
>> @echo off
>> C:\cygwin\bin\bash --login %~dp0myscript
>>
>> The QUOTATION MARK in the former is used to deal with the whitespaces
>> appearing in the myscript's pathname, though this is relatively rare
>> case.  But in the latter case, if there're whitespaces in the
>> myscript's pathname, the batch will fail to run.
>
>Hmm, perhaps the argument mangling at program startup is using the
>ANSI codepage (i.e. GBK in this case) when it should be using UTF-8?

But if I convert my batch file to UTF-8 (without BOM, with CR/LF line
endings), I get the following error:

/usr/bin/bash: "F:/zhaohs/Desktop/鏂版煡鏂囩尞/RestoreName4Elsevier.sh": No such file or directory
-- 
.: Hongyi Zhao [ hongyi.zhao AT gmail.com ] Free as in Freedom :.


--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
