Mail Archives: cygwin/2018/03/21/20:16:04

To: cygwin AT cygwin DOT com
From: "Dmitry Katsubo via cygwin" <cygwin AT cygwin DOT com>
Reply-To: Dmitry Katsubo <dma_k AT mail DOT ru>
Subject: Quotes around command-line argument that has unicode characters are not removed
Message-ID: <08d9621d-b9a0-c0d7-b58b-581ab957a08c@mail.ru>
Date: Thu, 22 Mar 2018 01:15:00 +0100

Dear Cygwin community,

I observe the following with my Cygwin installation: when I put quotes around a
file name that contains non-ASCII characters, the quotes are passed to the
process's argv literally; otherwise they are removed. I would expect the
behaviour to be consistent.

I have written a small C program that displays its arguments, and ran it three
times:

#1 For the file with a space, in quotes ("the file.txt") -- OK
#2 For the file with non-ASCII characters (Château.txt) -- OK
#3 For the file with non-ASCII characters, in quotes ("Château.txt") -- WRONG

d:\cli> uname -a
CYGWIN_NT-6.1-WOW PC 2.9.0(0.318/5/3) 2017-09-12 10:41 i686 Cygwin

D:\cli> chcp
Active code page: 866

D:\cli> dir
...cut...
2018-03-22  00:43                 0 Château.txt
2018-03-22  00:01               393 test.c
2018-03-22  00:01           150,230 test.exe
2018-03-21  00:15               186 test.pl
2018-03-22  00:43                 0 the file.txt
2018-03-22  00:40                16 текст плюс.txt
               6 File(s)        150,825 bytes
               2 Dir(s)  41,972,293,632 bytes free

D:\cli> test "the file.txt"
param 0 = test
param 1 = the file.txt
File 'the file.txt' was opened

D:\cli> test Château.txt
param 0 = test
param 1 = Château.txt
File 'Château.txt' was opened

D:\cli> test "Château.txt"
param 0 = test
param 1 = "Château.txt"
Failed to open '"Château.txt"': No such file or directory

As one can see, the last run fails. I am a bit puzzled: how can I pass the name
of a file that contains spaces and Unicode characters? I need to do it in a
uniform way, as I am calling a Cygwin program from a native Windows program, as
in [1].

D:\cli> test "текст плюс.txt"
param 0 = test
param 1 = "текст плюс.txt"
Failed to open '"текст плюс.txt"': No such file or directory
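
As a stopgap, assuming the source of the Cygwin program can be changed, the
program itself could drop one pair of enclosing quotes before using the
argument. This is only a sketch of a workaround, not a fix for the underlying
behaviour; the helper name strip_outer_quotes is hypothetical and not part of
Cygwin.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper (not a Cygwin API): if arg begins and ends with a
 * double quote, remove that single pair in place. Returns arg. */
char *strip_outer_quotes(char *arg)
{
	assert(arg != NULL);
	size_t len = strlen(arg);
	if (len >= 2 && arg[0] == '"' && arg[len - 1] == '"')
	{
		/* Shift the inner bytes left by one and terminate early. */
		memmove(arg, arg + 1, len - 2);
		arg[len - 2] = '\0';
	}
	return arg;
}
```

In test.c this would be used as fopen(strip_outer_quotes(argv[1]), "r"). An
argument without enclosing quotes is left unchanged, so the same call works for
all three cases above.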

I have searched a bit, but I couldn't find a direct answer. From posts [1] and
[2] I see that the compiler inserts code that pre-processes the arguments, for
example @pathnames [3], but what exactly are the rules? Is quote pre-processing
done in dcrt0.cc:177 [4]?
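
To make the expectation concrete, here is a minimal sketch of the quote
handling I would expect. It is not Cygwin's code from dcrt0.cc: the function
name split_simple is hypothetical, and it deliberately ignores backslash
escapes, globbing and @pathnames. The point is only that bytes outside ASCII
get no special treatment.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical illustration, NOT Cygwin's parser: split cmdline in place into
 * at most max_args arguments. Spaces and tabs separate arguments; double
 * quotes group text and are dropped. Returns the number of arguments stored. */
int split_simple(char *cmdline, char *argv_out[], int max_args)
{
	assert(cmdline != NULL && argv_out != NULL);
	int argc = 0;
	char *src = cmdline;
	while (*src != '\0' && argc < max_args)
	{
		while (*src == ' ' || *src == '\t')
			src++;                      /* skip separators */
		if (*src == '\0')
			break;
		char *dst = src;                /* output never outruns the input */
		argv_out[argc++] = dst;
		int in_quotes = 0;
		while (*src != '\0' && (in_quotes || (*src != ' ' && *src != '\t')))
		{
			if (*src == '"')
				in_quotes = !in_quotes; /* toggle grouping, drop the quote */
			else
				*dst++ = *src;          /* copy any other byte, ASCII or not */
			src++;
		}
		if (*src != '\0')
			src++;                      /* step over the separator first */
		*dst = '\0';                    /* then terminate this argument */
	}
	return argc;
}
```

With these rules, the command line test "Château.txt" would yield the two
arguments test and Château.txt, the same way test "the file.txt" already does.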

Any feedback is appreciated.

[1] https://sourceware.org/ml/cygwin/2016-05/msg00082.html
[2] http://daviddeley.com/autohotkey/parameters/parameters.htm
[3] https://cygwin.com/cygwin-ug-net/using-specialnames.html#pathnames-at
[4] https://github.com/openunix/cygwin/blob/master/winsup/cygwin/dcrt0.cc#L177

=== test.c ===
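#include <stdio.h>
#include <errno.h>
#include <string.h>

/* Print every argument as received, then try to open argv[1] for reading. */
int main(int argc, char* argv[])
{
	for (int i = 0; i < argc; i++)
	{
		printf("param %d = %s\n", i, argv[i]);
	}
	if (argc < 2)
	{
		/* Nothing to open; avoid passing NULL to fopen(). */
		fprintf(stderr, "usage: %s FILE\n", argv[0]);
		return 1;
	}
	FILE* f = fopen(argv[1], "r");
	if (f != NULL)
	{
		printf("File '%s' was opened\n", argv[1]);
		fclose(f);
	} else {
		/* strerror(errno) describes why fopen() failed. */
		printf("Failed to open '%s': %s\n", argv[1], strerror(errno));
	}
	return 0;
}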

-- 
With best regards,
Dmitry
