Date: Sat, 3 Dec 2022 19:28:10 +0900
From: Takashi Yano via Cygwin
Reply-To: Takashi Yano
To: cygwin AT cygwin DOT com
Cc: Ilya Zakharevich
Subject: Re: [BUG core?] Regression with parsing Windows’ command-line
Message-Id: <20221203192810.03c73015303ef3ad4fe241f3@nifty.ne.jp>
In-Reply-To: <20221203034030.a6ghnwcze4rkqeap@math.berkeley.edu>
References: <20221116124824 DOT zzobomcsmowvjtbr AT math DOT berkeley DOT edu> <20221203034030 DOT a6ghnwcze4rkqeap AT math DOT berkeley DOT edu>
List-Id: General Cygwin discussions and problem reports

On Fri, 2 Dec 2022 19:40:30 -0800
Ilya Zakharevich wrote:

> On Wed, Nov 16, 2022 at 04:48:25AM -0800, I wrote:
> > De-quoting (converting the Windows’ command-line into argc/argv) does
> > not remove double quotes if characters not fit for 8-bit (?) are present.
> >
> > Broken in: CYGWIN_NT-6.1 Bu 3.3.4(0.341/5/3) 2022-01-31 19:35 x86_64 Cygwin
> > Works in:  CYGWIN_NT-6.1-WOW Bu 2.2.1(0.289/5/3) 2015-08-20 11:40 i686 Cygwin
> >
> > To reproduce, do in CMD’s command line:
> >
> >   D:\> D:\Programs\cygwin2022\bin\perl -wle "print for @ARGV" . "/i/" "/и/" .
> >   .
> >   /i/
> >   "/и/"
> >   .
>
> I triple-checked
>   • with a Win10 machine (and a version of cygwin given above),
>   • with a fresh latest(=test)-cygwin-dll installation on a Win7 (as above) machine.
>
> Same bug everywhere.

This certainly seems to be a problem in cygwin1.dll.

I am not sure this is the right fix, but I have confirmed that
the following patch solves the issue.

diff --git a/newlib/libc/locale/lctype.c b/newlib/libc/locale/lctype.c
index 644669765..732d132e1 100644
--- a/newlib/libc/locale/lctype.c
+++ b/newlib/libc/locale/lctype.c
@@ -25,11 +25,20 @@
 
 #define LCCTYPE_SIZE (sizeof(struct lc_ctype_T) / sizeof(char *))
 
+#ifdef __CYGWIN__
+static char numsix[] = { '\6', '\0'};
+#else
 static char numone[] = { '\1', '\0'};
+#endif
 
 const struct lc_ctype_T _C_ctype_locale = {
+#ifdef __CYGWIN__
+    "UTF-8",            /* codeset */
+    numsix              /* mb_cur_max */
+#else
     "ASCII",            /* codeset */
     numone              /* mb_cur_max */
+#endif
 #ifdef __HAVE_LOCALE_INFO_EXTENDED__
     ,
     { "0", "1", "2", "3", "4",  /* outdigits */

-- 
Takashi Yano
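
For reference, the behaviour the report expects can be written down as a small toy de-quoter. This is only a sketch under simplifying assumptions, not the code in cygwin1.dll: the names toy_dequote and toy_dequote_selfcheck are hypothetical, only spaces, tabs and double quotes are treated specially (no backslash escapes, no globbing), and the input is assumed to be already UTF-8 encoded. The point it illustrates is that bytes >= 0x80 are copied through like any other character, so "/и/" loses its quotes just as "/i/" does.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper, NOT Cygwin's implementation.
   Splits cmdline in place into at most max_args arguments,
   dropping double quotes. Returns the number of arguments stored. */
int toy_dequote(char *cmdline, char **argv, int max_args)
{
    int argc = 0;
    char *src = cmdline;

    while (*src != '\0' && argc < max_args) {
        /* Skip the whitespace that separates arguments. */
        while (*src == ' ' || *src == '\t')
            src++;
        if (*src == '\0')
            break;

        char *dst = src;        /* write position, never ahead of src */
        int in_quotes = 0;
        argv[argc++] = dst;

        /* Inside quotes, whitespace does not end the argument. */
        while (*src != '\0' &&
               (in_quotes || (*src != ' ' && *src != '\t'))) {
            if (*src == '"')
                in_quotes = !in_quotes;  /* drop the quote itself */
            else
                *dst++ = *src;           /* plain byte copy, also for >= 0x80 */
            src++;
        }
        if (*src != '\0')
            src++;              /* step past the separator first ... */
        *dst = '\0';            /* ... then terminate the argument */
    }
    return argc;
}

/* Replays the arguments from the report: . "/i/" "/и/" .
   ("и" is written as its UTF-8 bytes 0xD0 0xB8). */
void toy_dequote_selfcheck(void)
{
    char line[] = ". \"/i/\" \"/\xd0\xb8/\" .";
    char *argv[8];
    int argc = toy_dequote(line, argv, 8);

    assert(argc == 4);
    assert(strcmp(argv[1], "/i/") == 0);
    assert(strcmp(argv[2], "/\xd0\xb8/") == 0);  /* quotes removed */
}
```

Calling toy_dequote_selfcheck() from a test program replays the reproduction case. Because the sketch never consults the locale, it says nothing about why the real parser fails; it only pins down the output the original poster expected.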