delorie.com/archives/browse.cgi   search  
Mail Archives: cygwin/2020/10/06/22:20:57

X-Recipient: archive-cygwin AT delorie DOT com
X-Original-To: cygwin AT cygwin DOT com
Delivered-To: cygwin AT cygwin DOT com
DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 1B9AE3858D35
Authentication-Results: sourceware.org; dmarc=none (p=none dis=none)
header.from=SystematicSw.ab.ca
Authentication-Results: sourceware.org;
spf=none smtp.mailfrom=brian DOT inglis AT systematicsw DOT ab DOT ca
X-Authority-Analysis: v=2.4 cv=bZHV7MDB c=1 sm=1 tr=0 ts=5f7d25d6
a=kiZT5GMN3KAWqtYcXc+/4Q==:117 a=kiZT5GMN3KAWqtYcXc+/4Q==:17
a=IkcTkHD0fZMA:10 a=Pdbar9f_tpPZj61_2-IA:9 a=QEXdDO2ut3YA:10
Subject: Re: Unconsistent command-line parsing in case of UTF-8 quoted
arguments
To: cygwin AT cygwin DOT com
References: <CAFC9CLCtfMORMxAK6==jdwY5ZbX6jWwo+JCfDwM3njgvGduf0w AT mail DOT gmail DOT com>
<634821436 DOT 20201004141809 AT yandex DOT ru>
<CAFC9CLCHk0WMj935OzZF+HeAdDbv-kGU_SHyi47vohagM+ZmtQ AT mail DOT gmail DOT com>
From: Brian Inglis <Brian DOT Inglis AT SystematicSw DOT ab DOT ca>
Autocrypt: addr=Brian DOT Inglis AT SystematicSw DOT ab DOT ca; prefer-encrypt=mutual;
keydata=
mDMEXopx8xYJKwYBBAHaRw8BAQdAnCK0qv/xwUCCZQoA9BHRYpstERrspfT0NkUWQVuoePa0
LkJyaWFuIEluZ2xpcyA8QnJpYW4uSW5nbGlzQFN5c3RlbWF0aWNTdy5hYi5jYT6IlgQTFggA
PhYhBMM5/lbU970GBS2bZB62lxu92I8YBQJeinHzAhsDBQkJZgGABQsJCAcCBhUKCQgLAgQW
AgMBAh4BAheAAAoJEB62lxu92I8Y0ioBAI8xrggNxziAVmr+Xm6nnyjoujMqWcq3oEhlYGAO
WacZAQDFtdDx2koSVSoOmfaOyRTbIWSf9/Cjai29060fsmdsDLg4BF6KcfMSCisGAQQBl1UB
BQEBB0Awv8kHI2PaEgViDqzbnoe8B9KMHoBZLS92HdC7ZPh8HQMBCAeIfgQYFggAJhYhBMM5
/lbU970GBS2bZB62lxu92I8YBQJeinHzAhsMBQkJZgGAAAoJEB62lxu92I8YZwUBAJw/74rF
IyaSsGI7ewCdCy88Lce/kdwX7zGwid+f8NZ3AQC/ezTFFi5obXnyMxZJN464nPXiggtT9gN5
RSyTY8X+AQ==
Organization: Systematic Software
Message-ID: <6d976fd5-74f9-a9e0-49a1-26993fc13b51@SystematicSw.ab.ca>
Date: Tue, 6 Oct 2020 20:20:04 -0600
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:68.0) Gecko/20100101
Thunderbird/68.12.1
MIME-Version: 1.0
In-Reply-To: <CAFC9CLCHk0WMj935OzZF+HeAdDbv-kGU_SHyi47vohagM+ZmtQ@mail.gmail.com>
X-CMAE-Envelope: MS4xfJ/Jq3XMYBDB7NMgFBCrdejNH/PrdMha11JfQYueJzuMNhKKU++pMspDkFBy0g9SRwfr4if+m0lsMSoX52j7UzMzy5s0zJ4Qp88tCmINQTa2ksWvZJb6
dkHsyX6XCT70WvAc3QaTNbdHdTw2YuBIItuEOpWLp1Jd7CGHmykfNhYUyClaMe14s3PnkM8qhiWkWkfPcnMNzZlpdKKRVNsm2Uc=
X-Spam-Status: No, score=-6.6 required=5.0 tests=BAYES_00, KAM_DMARC_STATUS,
KAM_LAZY_DOMAIN_SECURITY, NICE_REPLY_A, RCVD_IN_DNSWL_LOW, RCVD_IN_MSPIKE_H3,
RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_NONE,
TXREP autolearn=no autolearn_force=no version=3.4.2
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on
server2.sourceware.org
X-BeenThere: cygwin AT cygwin DOT com
X-Mailman-Version: 2.1.29
List-Id: General Cygwin discussions and problem reports <cygwin.cygwin.com>
List-Unsubscribe: <https://cygwin.com/mailman/options/cygwin>,
<mailto:cygwin-request AT cygwin DOT com?subject=unsubscribe>
List-Archive: <https://cygwin.com/pipermail/cygwin/>
List-Post: <mailto:cygwin AT cygwin DOT com>
List-Help: <mailto:cygwin-request AT cygwin DOT com?subject=help>
List-Subscribe: <https://cygwin.com/mailman/listinfo/cygwin>,
<mailto:cygwin-request AT cygwin DOT com?subject=subscribe>
Reply-To: cygwin AT cygwin DOT com
Errors-To: cygwin-bounces AT cygwin DOT com
Sender: "Cygwin" <cygwin-bounces AT cygwin DOT com>
X-MIME-Autoconverted: from base64 to 8bit by delorie.com id 0972Kctx022450

On 2020-10-06 15:36, Jérôme Froissart wrote:
> Thanks for your replies.
> This issue only happens when a program is run from cmd.exe, not from a
> Cygwin bash shell.
> This is important for me, since I discovered this bug in a project
> that must be run from Windows graphical shell (i.e. there is no
> sensible way to run it through Cygwin and Bash).
> 
>> Please show us the output from "uname -a" and "locale" run from the bash prompt.
> 
>> Please provide the results of "locale" command right before running your test
>> binary.
> Here are the more detailed steps to reproduce the issue (along with
> answers to your requests about `uname`, `locale`, etc.).
> (I mostly reproduced what billziss-gh had done before, I do not take
> all the credits :D)
> 
> Here is an example C file

> I have built it with gcc from Cygwin
>     $ gcc -o binary example.c
> 
> Running it from the same Cygwin bash prompt works as expected
>     $ uname -a
>     CYGWIN_NT-10.0 XPS 3.1.5(0.340/5/3) 2020-06-01 08:59 x86_64 Cygwin
>     # (XPS is my Windows machine name)
> 
>     $ locale
>     LANG=fr_FR.UTF-8
>     LC_CTYPE="fr_FR.UTF-8"
>     LC_NUMERIC="fr_FR.UTF-8"
>     LC_TIME="fr_FR.UTF-8"
>     LC_COLLATE="fr_FR.UTF-8"
>     LC_MONETARY="fr_FR.UTF-8"
>     LC_MESSAGES="fr_FR.UTF-8"
>     LC_ALL=
> 
>     $ which gcc
>     /usr/bin/gcc
> 
>     # The following runs as expected
>     $ ./binary.exe "foo bar" "Jérôme"
>     C="C:\Users\Public\binary.exe"
>     0=./binary
>     1=foo bar
>     2=Jérôme
> 
> Now, let's start a Windows shell (cmd.exe)
> Note that I had to copy cygwin1.dll from my Cygwin installation
> directory, otherwise binary.exe would not start.
> I do not know whether there is a `locale` equivalent in Windows
> command prompt, so I merely ran my program.
>     C:\Users\Public>binary.exe "foo bar" "Jérôme"
>     C=binary.exe  "foo bar" "Jâ–¡râ–¡me"
>     0=binary
>     1=foo bar
>     2="Jérôme"
> 
> This behaviour is not expected and is quite inconsistent with what
> happened through Bash.
> Besides the "strange squares" that appear on the first line, and the
> extra space after binary.exe, I especially did not expect "Jérôme" to
> remain quoted as a second argument.
> 
> Sorry for the delay in my answer. I hope this is now clear, please ask
> me for more examples or investigation if you need.
> Thanks for your help.

Create a new or change your current Command Prompt shortcut to run:

	"%windir%\system32\cmd /u"

"/U Causes the output of internal commands to a pipe or file to be Unicode"

and add "chcp 65001":

	"%windir%\system32\cmd /u /k chcp 65001"

or set

	HKEY_LOCAL_MACHINE\Software\Microsoft\Command Processor\Autorun

or

	HKEY_CURRENT_USER\Software\Microsoft\Command Processor\AutoRun

to command

	"@chcp 65001 > nul"

e.g.

	> reg add HKEY_CURRENT_USER\Software\Microsoft\Command Processor ^
		/v AutoRun /d "@chcp 65001 > nul" /f

-- 
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada

This email may be disturbing to some readers as it contains
too much technical detail. Reader discretion is advised.
[Data in binary units and prefixes, physical quantities in SI.]
--
Problem reports:      https://cygwin.com/problems.html
FAQ:                  https://cygwin.com/faq/
Documentation:        https://cygwin.com/docs.html
Unsubscribe info:     https://cygwin.com/ml/#unsubscribe-simple

- Raw text -


  webmaster     delorie software   privacy  
  Copyright © 2019   by DJ Delorie     Updated Jul 2019