Mail Archives: cygwin/2018/03/22/08:25:01

Date: Thu, 22 Mar 2018 15:24:37 +0300
From: "Mikhail Usenko via cygwin" <cygwin AT cygwin DOT com>
Reply-To: Mikhail Usenko <cygwin AT inbox DOT ru>
To: cygwin AT cygwin DOT com
Cc: Dmitry Katsubo <dma_k AT mail DOT ru>
Subject: Re: Quotes around command-line argument that has unicode characters are not removed
Message-Id: <20180322152437.a37c3dd3b778bba765e2124c@inbox.ru>
In-Reply-To: <08d9621d-b9a0-c0d7-b58b-581ab957a08c@mail.ru>
References: <08d9621d-b9a0-c0d7-b58b-581ab957a08c AT mail DOT ru>

On Thu, 22 Mar 2018 01:15:00 +0100
Dmitry Katsubo via cygwin <...> wrote:

> Dear Cygwin community,
> 
> I observe the following on my Cygwin: when I put quotes around a file name that
> contains non-ASCII characters, the quotes are passed to the process's argv
> literally; otherwise they are removed. I would expect consistent behavior.
> 
> I have written a small C program that displays its arguments, and ran it three
> times:
> 
> #1 For the file with a space, in quotes ("the file.txt") -- OK
> #2 For the file with non-ASCII characters (Château.txt) -- OK
> #3 For the file with non-ASCII characters, in quotes ("Château.txt") -- WRONG
> 
> d:\cli> uname -a
> CYGWIN_NT-6.1-WOW PC 2.9.0(0.318/5/3) 2017-09-12 10:41 i686 Cygwin
> 
> D:\cli> chcp
> Active code page: 866
> 
> D:\cli> dir
> ...cut...
> 2018-03-22  00:43                 0 Château.txt
> 2018-03-22  00:01               393 test.c
> 2018-03-22  00:01           150,230 test.exe
> 2018-03-21  00:15               186 test.pl
> 2018-03-22  00:43                 0 the file.txt
> 2018-03-22  00:40                16 текст плюс.txt
>                6 File(s)        150,825 bytes
>                2 Dir(s)  41,972,293,632 bytes free
> 
> D:\cli> test "the file.txt"
> param 0 = test
> param 1 = the file.txt
> File 'the file.txt' was opened
> 
> D:\cli> test Château.txt
> param 0 = test
> param 1 = Château.txt
> File 'Château.txt' was opened
> 
> D:\cli> test "Château.txt"
> param 0 = test
> param 1 = "Château.txt"
> Failed to open '"Château.txt"': No such file or directory
> 
> As one can see, the last run fails. I am a bit puzzled: how can I pass the name
> of a file that contains spaces and Unicode characters? I need to do it in a
> uniform way, as I am calling a Cygwin program from a native Windows program, as
> in [1].
> 
> D:\cli> test "текст плюс.txt"
> param 0 = test
> param 1 = "текст плюс.txt"
> Failed to open '"текст плюс.txt"': No such file or directory
> 
> I have searched a bit, but I couldn't find a direct answer. From posts [1] and
> [2] I see that the compiler inserts code to do some argument pre-processing,
> such as @pathnames [3], but what exactly are the rules? Is quote pre-processing
> done in dcrt0.cc:177 [4]?
> 
> Any feedback is appreciated.
> 
> [1] https://sourceware.org/ml/cygwin/2016-05/msg00082.html
> [2] http://daviddeley.com/autohotkey/parameters/parameters.htm
> [3] https://cygwin.com/cygwin-ug-net/using-specialnames.html#pathnames-at
> [4] https://github.com/openunix/cygwin/blob/master/winsup/cygwin/dcrt0.cc#L177
> 
> === test.c ===
> #include <stdio.h>
> #include <errno.h>
> #include <string.h>
> 
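> int main(int argc, char* argv[])
> {
> 	/* print every argument exactly as the runtime delivered it */
> 	for (int i = 0; i < argc; i++)
> 	{
> 		printf("param %d = %s\n", i, argv[i]);
> 	}
> 	if (argc < 2)
> 	{
> 		/* argv[1] is NULL without an argument; do not pass it to fopen() */
> 		printf("Usage: test <file>\n");
> 		return 1;
> 	}
> 	FILE* f = fopen(argv[1], "r");
> 	if (f != NULL)
> 	{
> 		printf("File '%s' was opened\n", argv[1]);
> 		fclose(f);
> 	} else {
> 		printf("Failed to open '%s': %s\n", argv[1], strerror(errno));
> 	}
> 	return 0;
> }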
> 
> -- 

Hello, Dmitry,
consider these test cases:

Native (msvcrt) binary:
-----------------------
$ x86_64-w64-mingw32-gcc test.c -o test-win.exe
$ ldd test-win.exe
        ntdll.dll => /cygdrive/c/Windows/SYSTEM32/ntdll.dll (0x7fa05900000)
        KERNEL32.DLL => /cygdrive/c/Windows/system32/KERNEL32.DLL (0x7fa030e0000)
        KERNELBASE.dll => /cygdrive/c/Windows/system32/KERNELBASE.dll (0x7fa028f0000)
        msvcrt.dll => /cygdrive/c/Windows/system32/msvcrt.dll (0x7fa03220000)
-----------------------

Cygwin-flavor binary:
---------------------
$ gcc test.c -o test-cygwin.exe
$ ldd test-cygwin.exe
        ntdll.dll => /cygdrive/c/Windows/SYSTEM32/ntdll.dll (0x7fa05900000)
        KERNEL32.DLL => /cygdrive/c/Windows/system32/KERNEL32.DLL (0x7fa030e0000)
        KERNELBASE.dll => /cygdrive/c/Windows/system32/KERNELBASE.dll (0x7fa028f0000)
        cygwin1.dll => /usr/bin/cygwin1.dll (0x180040000)
---------------------

Create a file with non-ASCII chars in the name:
-----------------------------------------------
$ touch "текст плюс.txt"
-----------------------------------------------

Run both binaries in mintty with bash:
--------------------------------------
$ ./test-win "текст плюс.txt"
param 0 = D:\wroot\test.cygwin\Quotes around command-line argument that has unicode characters are not removed\test-win.exe
param 1 = ▒▒▒▒▒ ▒▒▒▒.txt
File '▒▒▒▒▒ ▒▒▒▒.txt' was opened
$ ./test-cygwin "текст плюс.txt"
param 0 = ./test-cygwin
param 1 = текст плюс.txt
File 'текст плюс.txt' was opened
--------------------------------------

Run the binaries in cmd.exe with bash:
--------------------------------------
$ ./test-win "текст плюс.txt"
param 0 = D:\wroot\test.cygwin\Quotes around command-line argument that has unicode characters are not removed\test-win.exe
param 1 = ЄхъёЄ яы■ё.txt
File 'ЄхъёЄ яы■ё.txt' was opened
$ ./test-cygwin "текст плюс.txt"
param 0 = ./test-cygwin
param 1 = текст плюс.txt
File 'текст плюс.txt' was opened
--------------------------------------
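
A way to check what each binary really receives is to dump the raw bytes of
every argument instead of trusting how the terminal renders them. The string
'ЄхъёЄ яы■ё' above is what the CP1251 bytes of 'текст плюс' look like when they
are displayed as CP866, so presumably test-win gets the name in the ANSI code
page (that is an assumption about this machine, I have not verified it). The
helper below is a hypothetical addition to test.c, not part of the original
program; hex_dump is a name made up for this sketch:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not in the original test.c): writes the bytes of s
   into out as space-separated uppercase hex, e.g. "41 42". Returns the number
   of bytes dumped; stops early rather than overflowing out (size outsz). */
size_t hex_dump(const char *s, char *out, size_t outsz)
{
	size_t n = 0, pos = 0;

	assert(s != NULL && out != NULL);
	if (outsz == 0)
		return 0;
	out[0] = '\0';
	for (; s[n] != '\0'; n++)
	{
		/* a byte needs at most 3 characters (separator + "XX") plus the NUL */
		if (pos + 4 > outsz)
			break;
		pos += (size_t)snprintf(out + pos, outsz - pos,
		                        n ? " %02X" : "%02X", (unsigned char)s[n]);
	}
	return n;
}
```

Inside the argument loop of test.c one could then call
hex_dump(argv[i], buf, sizeof buf) with a local char buf[256] and print buf
next to the argument. Bytes such as F2 E5 EA F1 F2 would point to CP1251,
while D1 82 D0 B5 ... would be UTF-8.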

Run both binaries in bare cmd.exe
(/usr/bin/cygwin1.dll must first be copied next to test-cygwin.exe):
-------------------
D:\wroot\test.cygwin\Quotes around command-line argument that has unicode characters are not removed>.\test-win.exe "текст плюс.txt"
param 0 = .\test-win.exe
param 1 = ЄхъёЄ яы■ё.txt
File 'ЄхъёЄ яы■ё.txt' was opened
D:\wroot\test.cygwin\Quotes around command-line argument that has unicode characters are not removed>.\test-cygwin.exe "текст плюс.txt"
param 0 = ./test-cygwin
param 1 = "текст плюс.txt"
Failed to open '"текст плюс.txt"': No such file or directory
-------------------

In bare cmd.exe the native msvcrt binary works fine with quoted non-ASCII
arguments, while the Cygwin-flavor binary does not. But I don't know exactly
which layer is responsible for this behavior: cmd.exe or
msvcrt.dll/cygwin1.dll.
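
For context: as far as I know, a new Windows process receives its command line
as one single string, and splitting it into argv (including the removal of
quotes) is done by the startup code of the runtime the binary is linked
against, not by cmd.exe. The sketch below only illustrates that quote-stripping
step. split_cmdline is a hypothetical name, and the rules are deliberately
simplified (no backslash escapes, no wildcards, no @file expansion), so this is
neither the msvcrt nor the cygwin1.dll algorithm:

```c
#include <assert.h>
#include <string.h>

/* Simplified, hypothetical splitter: breaks cmdline into arguments in place,
   using spaces and tabs as separators and dropping double quotes.
   Returns the number of arguments stored in argv (at most max_args). */
int split_cmdline(char *cmdline, char *argv[], int max_args)
{
	int argc = 0;
	char *src = cmdline;

	assert(cmdline != NULL && argv != NULL);
	while (argc < max_args)
	{
		while (*src == ' ' || *src == '\t')
			src++;                        /* skip separators */
		if (*src == '\0')
			break;

		char *dst = src;                  /* argument is compacted in place */
		int in_quotes = 0;
		argv[argc++] = dst;
		while (*src != '\0' && (in_quotes || (*src != ' ' && *src != '\t')))
		{
			if (*src == '"')
				in_quotes = !in_quotes;   /* the quote itself is dropped */
			else
				*dst++ = *src;            /* bytes >= 0x80 are copied as-is */
			src++;
		}
		if (*src != '\0')
			src++;                        /* step past the separator first */
		*dst = '\0';                      /* then terminate the argument */
	}
	return argc;
}
```

With these rules, quotes around a non-ASCII name are removed exactly like
quotes around an ASCII one, which is the consistent behavior the original post
expects. The input buffer must be writable, because the arguments are
terminated in place.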


