To: =?UTF-8?Q?J=C3=A9r=C3=B4me_Froissart?= <software@froissart.eu>
Subject: Re: Unconsistent command-line parsing in case of UTF-8 quoted
 arguments
Date: Sun, 18 Oct 2020 19:32:11 -0700
Message-ID: <6a30ae30f769cba0dbf7a80423c20ac1@mail.kylheku.com>
From: "Kaz Kylheku \(Cygwin\) via Cygwin" <cygwin@cygwin.com>

On 2020-10-14 14:47, Jérôme Froissart wrote:
>> The choice of GetCommandLineA was for illustration purposes;
>> had I used GetCommandLineW I would not be able to printf
>> using %ls under CMD.EXE, because of code page issues. However
>> here is a modified version of the test program that uses
>> GetCommandLineW.

[ ... ]

>> billziss@xps:~/Projects/t$ ./cyg.exe "foo bar" "Domain\Jérôme"
>> 0022 "   0043 C   003a :   005c \   0055 U   0073 s   0065 e   0072 r
>> 0073 s   005c \   0062 b   0069 i   006c l   006c l   007a z   0069 i
>> 0073 s   0073 s   005c \   0050 P   0072 r   006f o   006a j   0065 e
>> 0063 c   0074 t   0073 s   005c \   0074 t   005c \   0063 c   0079 y
>> 0067 g   002e .   0065 e   0078 x   0065 e   0022 "

[ ... ]

>> C:\Users\billziss\Projects\t>cyg.exe "foo bar" "Domain\Jérôme"
>> 0063 c   0079 y   0067 g   002e .   0065 e   0078 x   0065 e   0020
>> 0020     0022 "   0066 f   006f o   006f o   0020     0062 b   0061 a
>> 0072 r   0022 "   0020     0022 "   0044 D   006f o   006d m   0061 a
>> 0069 i   006e n   005c \   004a J   00e9 .   0072 r   00f4 .   006d m
>> 0065 e   0022 "

Aha! There is a hint of a problem here. For a start, the two command
lines are obviously different.

The Cygwin one starts with a quote that we did not see, wrapping
the full path to the executable:

   "C:\Users\billziss\Projects\t\cyg.exe"

It ends there. Why is that? I'm guessing that the command line was
tokenized destructively: a null character was written into the buffer
right after the program name, so the dump stops at that point.

But under cmd.exe, we see the whole command line, without any null
character having been written in it. Moreover, the program name just
appears as the original relative path cyg.exe with no quotes.

What a mess. :)




