Mail Archives: cygwin/2020/10/02/17:41:12

From: Jérôme Froissart <software AT froissart DOT eu>
Date: Fri, 2 Oct 2020 23:40:12 +0200
Message-ID: <CAFC9CLCtfMORMxAK6==jdwY5ZbX6jWwo+JCfDwM3njgvGduf0w@mail.gmail.com>
Subject: Unconsistent command-line parsing in case of UTF-8 quoted arguments
To: cygwin AT cygwin DOT com

Hello,

While discussing a pull request on another project [1], I think
billziss-gh found an inconsistency in the way Cygwin parses
command-line arguments when non-ASCII characters come into play.

EXPECTED BEHAVIOUR:
Cygwin should parse the following command line
    binary.exe --non-ascii "charaçtérs" --ascii "nothing-fancy-here"
as
    argv = ["binary.exe",
            "--non-ascii",
            "chara\xXX\xXXt\xXX\xXXrs",
            "--ascii",
            "nothing-fancy-here"]
    // \xXX\xXX stands for the UTF-8 encoding of each special character;
    // the exact bytes do not really matter here
before calling main().

ACTUAL BEHAVIOUR:
it parses it as
    argv = ["binary.exe",
            "--non-ascii",
            "\"chara\xXX\xXXt\xXX\xXXrs\"", // mind the unstripped quotes here...
            "--ascii",
            "nothing-fancy-here" // ...but not here
    ]
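
To check what a program actually receives, it can help to print each
argument with its non-ASCII bytes made visible. The following is only
a minimal sketch: the helper name escape_arg and the DEMO_MAIN switch
are made up for this illustration and are not part of Cygwin or of
sshfs-win.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: render arg into out, keeping printable ASCII as-is
 * and writing every other byte as \xHH, so that stray quotes and UTF-8
 * bytes are both visible. Output is truncated if out is too small.
 * Returns out. */
char *escape_arg(const char *arg, char *out, size_t out_size)
{
    size_t used = 0;

    assert(arg != NULL && out != NULL && out_size > 0);
    out[0] = '\0';
    for (const unsigned char *p = (const unsigned char *)arg; *p != '\0'; p++) {
        int printable = (*p >= 0x20 && *p < 0x7f);
        size_t need = printable ? 1 : 4;    /* "\xHH" takes 4 characters */

        if (used + need + 1 > out_size)
            break;                          /* no room left: truncate */
        if (printable)
            out[used++] = (char)*p;
        else
            used += (size_t)snprintf(out + used, out_size - used,
                                     "\\x%02x", (unsigned)*p);
        out[used] = '\0';
    }
    return out;
}

#ifdef DEMO_MAIN
/* Build with -DDEMO_MAIN to get a standalone argv dumper. */
int main(int argc, char **argv)
{
    char buf[1024];

    for (int i = 0; i < argc; i++)
        printf("argv[%d] = [%s]\n", i, escape_arg(argv[i], buf, sizeof buf));
    return 0;
}
#endif
```

Built with -DDEMO_MAIN and started with the command line above, an
argument that still carries its quotes would show up with a literal "
at both ends inside the square brackets.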

It looks like quoted words containing non-ASCII (UTF-8) characters do
not get their surrounding quotes stripped, unlike ASCII-only words.

More examples and a better description are available at [1] (thanks to
billziss-gh for his analysis, which is much more thorough than mine).
For the record, we wrote a workaround in our specific program, but
handling this issue in Cygwin might be a better way to solve it.
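
For illustration, a program-side workaround of this kind could look
like the sketch below. It is not the code from [1]; the function name
strip_surrounding_quotes is hypothetical. It assumes that an argument
which both starts and ends with a double quote was left unstripped by
the parser.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: if arg is wrapped in exactly one leading and one
 * trailing double quote, remove that pair in place. Any other argument
 * is left untouched. Returns arg. */
char *strip_surrounding_quotes(char *arg)
{
    size_t len;

    assert(arg != NULL);
    len = strlen(arg);
    if (len >= 2 && arg[0] == '"' && arg[len - 1] == '"') {
        memmove(arg, arg + 1, len - 2);   /* shift the content left by one */
        arg[len - 2] = '\0';              /* drop the trailing quote */
    }
    return arg;
}
```

Such a helper could be applied to each argv[i] at the start of main().
One drawback is that it cannot tell leftover quotes from quotes the
user really meant to pass, which is one reason why a fix in Cygwin
itself may be preferable.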

[1]: https://github.com/billziss-gh/sshfs-win/pull/208 (Checking for
quotes around non-ascii usernames passed by Windows)

Thanks for your help! If you do not have time, please tell me where
to look, and I might try to fix it myself and send a patch proposal if
that is easy enough (I have never read Cygwin's code).
Jérôme
