To: cygwin AT cygwin DOT com
From: Hongyi Zhao
Subject: Re: 1.7.1: unable to run the a bash script resides in chinese path using: c:\cygwin\bin\bash --login script.
Date: Sat, 20 Feb 2010 12:00:37 +0800
References: <416096c61002191229x670cbb63gf5c693056af727a2 AT mail DOT gmail DOT com>

On Fri, 19 Feb 2010 20:29:27 +0000, Andy Koppe wrote:

>Looks like there's some sort of GBK vs UTF-8 mixup going on, because
>'鏂版煡鏂囩尞' is the same byte sequence in GBK as '新查文献' is in UTF-8:
>\xE6\x96\xB0\xE6\x9F\xA5\xE6\x96\x87\xE7\x8C\xAE

Wonderful analysis! Could you please give me some hints on the tools you
used to reach this conclusion?

>
>I take it the actual directory name is '新查文献'? (Babelfish seems to be
>able to make some sense of that one but not the other.)

Yes, you're right. The actual directory name is '新查文献'.

>
>Do you know what the encoding of your batch file is?

GB2312

> And have you got
>any locale variables (LC_ALL, LC_CTYPE, LANG) set when invoking it?

I use the following settings in the same batch file:

set LC_ALL=zh_CN.UTF-8
set LC_CTYPE="zh_CN.UTF-8"
set LANG=zh_CN.UTF-8

>
>
>>>@echo off
>>>C:\cygwin\bin\bash --login "%~dp0myscript"
>>
>> I've found a more strange thing: If I change the batch file into the
>> following form, then it will be run smoothly:
>>
>> @echo off
>> C:\cygwin\bin\bash --login %~dp0myscript
>>
>> The QUOTATION MARK in the former is used to deal with the whitespaces
>> appearing in the myscript's pathname, though this is relatively rare
>> case. But in the latter case, if there're whitespaces in the
>> myscript's pathname, the batch will fail to run.
>
>Hmm, perhaps the argument mangling at program startup is using the
>ANSI codepage (i.e. GBK in this case) when it should be using UTF-8?

But if I convert my batch file to UTF-8 (without BOM, with CR/LF line
endings), I get the following error:

/usr/bin/bash: "F:/zhaohs/Desktop/鏂版煡鏂囩尞/RestoreName4Elsevier.sh": No such file or directory

-- 
.: Hongyi Zhao [ hongyi.zhao AT gmail.com ] Free as in Freedom :.
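
For anyone wanting to check the GBK vs UTF-8 mixup described above, here is a minimal Python sketch. Python is not mentioned anywhere in this thread and is not necessarily the tool Andy used; the variable names are purely illustrative. It encodes the real directory name as UTF-8, then decodes those same bytes as GBK, which should give the garbled name seen in the error message.

```python
# Minimal sketch: reproduce the GBK vs UTF-8 mixup discussed in this thread.
# The two strings are taken from the thread; all variable names are
# illustrative assumptions, not part of any Cygwin tooling.

real_name = "新查文献"      # actual directory name, per the thread
garbled_name = "鏂版煡鏂囩尞"  # name shown in the bash error message

# Bytes of the real directory name when encoded as UTF-8.
utf8_bytes = real_name.encode("utf-8")

# Misreading those UTF-8 bytes as GBK produces the mojibake.
mojibake = utf8_bytes.decode("gbk")

# Undoing the mistake (encode as GBK, decode as UTF-8) recovers the name.
recovered = mojibake.encode("gbk").decode("utf-8")

if __name__ == "__main__":
    # Print only ASCII so the check works on any console encoding.
    print(" ".join(f"{b:02X}" for b in utf8_bytes))
    print("matches garbled name:", mojibake == garbled_name)
    print("round trip recovers real name:", recovered == real_name)
```

If both comparisons print True, the garbled path is the UTF-8 bytes of the real name being interpreted as GBK somewhere between the batch file and bash, which is consistent with the ANSI codepage guess quoted above.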