Mail Archives: cygwin/2022/12/03/08:44:19
Date: Sat, 3 Dec 2022 22:42:46 +0900
To: cygwin AT cygwin DOT com
Subject: Re: [BUG core?] Regression with parsing Windows’ command-line
Message-Id: <20221203224246.e81fcbb5ba989a4a7c25ddde@nifty.ne.jp>
In-Reply-To: <20221203192810.03c73015303ef3ad4fe241f3@nifty.ne.jp>
References: <20221116124824 DOT zzobomcsmowvjtbr AT math DOT berkeley DOT edu>
 <20221203034030 DOT a6ghnwcze4rkqeap AT math DOT berkeley DOT edu>
 <20221203192810 DOT 03c73015303ef3ad4fe241f3 AT nifty DOT ne DOT jp>
From: Takashi Yano via Cygwin <cygwin AT cygwin DOT com>
Reply-To: Takashi Yano <takashi DOT yano AT nifty DOT ne DOT jp>
On Sat, 3 Dec 2022 19:28:10 +0900
Takashi Yano wrote:
> On Fri, 2 Dec 2022 19:40:30 -0800
> Ilya Zakharevich wrote:
> > On Wed, Nov 16, 2022 at 04:48:25AM -0800, I wrote:
> > > De-quoting (converting the Windows’ command-line into argc/argv) does
> > > not remove double quotes if characters not fit for 8-bit (?) are present.
> > >
> > > Broken in: CYGWIN_NT-6.1 Bu 3.3.4(0.341/5/3) 2022-01-31 19:35 x86_64 Cygwin
> > > Works in: CYGWIN_NT-6.1-WOW Bu 2.2.1(0.289/5/3) 2015-08-20 11:40 i686 Cygwin
> > >
> > > To reproduce, do in CMD’s command line:
> > >
> > > D:\> D:\Programs\cygwin2022\bin\perl -wle "print for @ARGV" . "/i/" "/и/" .
> > > .
> > > /i/
> > > "/и/"
> > > .
> >
> > I triple-checked
> > • with a Win10 machine (and a version of cygwin given above),
> > • with a fresh latest(=test)-cygwin-dll installation on a Win7 (as above) machine.
> >
> > Same bug everywhere.
>
> This certainly seems to be a problem of cygwin1.dll.
>
> Though I am not sure this is the right thing, I have confirmed
> that the following patch solves the issue.
>
> diff --git a/newlib/libc/locale/lctype.c b/newlib/libc/locale/lctype.c
> index 644669765..732d132e1 100644
> --- a/newlib/libc/locale/lctype.c
> +++ b/newlib/libc/locale/lctype.c
> @@ -25,11 +25,20 @@
>
> #define LCCTYPE_SIZE (sizeof(struct lc_ctype_T) / sizeof(char *))
>
> +#ifdef __CYGWIN__
> +static char numsix[] = { '\6', '\0'};
> +#else
> static char numone[] = { '\1', '\0'};
> +#endif
>
> const struct lc_ctype_T _C_ctype_locale = {
> +#ifdef __CYGWIN__
> + "UTF-8", /* codeset */
> + numsix /* mb_cur_max */
> +#else
> "ASCII", /* codeset */
> numone /* mb_cur_max */
> +#endif
> #ifdef __HAVE_LOCALE_INFO_EXTENDED__
> ,
> { "0", "1", "2", "3", "4", /* outdigits */
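
The patch above also affects __C_locale, because it changes _C_ctype_locale itself.
The patch below should be more appropriate: it leaves _C_ctype_locale alone and
only points LC_CTYPE of __global_locale at a new UTF-8 table.
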
diff --git a/newlib/libc/locale/locale.c b/newlib/libc/locale/locale.c
index e523d2366..7485ac292 100644
--- a/newlib/libc/locale/locale.c
+++ b/newlib/libc/locale/locale.c
@@ -244,6 +244,21 @@ const struct __locale_t __C_locale =
};
#endif /* _MB_CAPABLE */
+#ifdef __CYGWIN__
+static char numsix[] = { '\6', '\0'};
+static const struct lc_ctype_T _C_UTF8_ctype_locale = {
+ "UTF-8", /* codeset */
+ numsix /* mb_cur_max */
+#ifdef __HAVE_LOCALE_INFO_EXTENDED__
+ ,
+ { "0", "1", "2", "3", "4", /* outdigits */
+ "5", "6", "7", "8", "9" },
+ { L"0", L"1", L"2", L"3", L"4", /* woutdigits */
+ L"5", L"6", L"7", L"8", L"9" }
+#endif
+};
+#endif
+
struct __locale_t __global_locale =
{
{ "C", "C", DEFAULT_LOCALE, "C", "C", "C", "C", },
@@ -272,10 +287,11 @@ struct __locale_t __global_locale =
{ NULL, NULL }, /* LC_ALL */
#ifdef __CYGWIN__
{ &_C_collate_locale, NULL }, /* LC_COLLATE */
+ { &_C_UTF8_ctype_locale, NULL }, /* LC_CTYPE */
#else
{ NULL, NULL }, /* LC_COLLATE */
-#endif
{ &_C_ctype_locale, NULL }, /* LC_CTYPE */
+#endif
{ &_C_monetary_locale, NULL }, /* LC_MONETARY */
{ &_C_numeric_locale, NULL }, /* LC_NUMERIC */
{ &_C_time_locale, NULL }, /* LC_TIME */
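
For readers who do not have the newlib sources at hand, here is a simplified model of what the second patch changes. It is only a sketch: every name ending in _sketch is hypothetical, only the two fields the patch sets are modelled, and it assumes that __C_locale keeps pointing at the ASCII table and that the maximum multibyte length is read from the first byte of mb_cur_max.

```c
#include <assert.h>
#include <string.h>

/* Simplified stand-in for newlib's lc_ctype_T (hypothetical type):
   only the two fields that the patch initialises are modelled. */
struct lc_ctype_sketch {
    const char *codeset;
    const char *mb_cur_max;   /* one-byte count followed by '\0' */
};

static const char numone[] = { '\1', '\0' };
static const char numsix[] = { '\6', '\0' };

/* The existing ASCII table and the new UTF-8 table added by the patch. */
static const struct lc_ctype_sketch c_ctype      = { "ASCII", numone };
static const struct lc_ctype_sketch c_utf8_ctype = { "UTF-8", numsix };

/* Simplified stand-in for a locale object (hypothetical type). */
struct locale_sketch {
    const struct lc_ctype_sketch *ctype;
};

/* Assumption: the "C" locale object keeps the ASCII table ... */
static const struct locale_sketch c_locale = { &c_ctype };
/* ... while the initial global locale points at the UTF-8 table, as the
   __CYGWIN__ branch of the patch does for LC_CTYPE. */
static struct locale_sketch global_locale = { &c_utf8_ctype };

/* Assumption: the maximum multibyte length is the first byte of mb_cur_max. */
static int max_mb_len(const struct locale_sketch *loc)
{
    assert(loc && loc->ctype);
    return (unsigned char)loc->ctype->mb_cur_max[0];
}

/* True when the locale's LC_CTYPE codeset is UTF-8. */
static int is_utf8(const struct locale_sketch *loc)
{
    assert(loc && loc->ctype);
    return strcmp(loc->ctype->codeset, "UTF-8") == 0;
}
```

In this model max_mb_len(&c_locale) stays at 1 while max_mb_len(&global_locale) becomes 6, which is the separation the second patch aims for and the first patch did not provide.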
--
Takashi Yano <takashi DOT yano AT nifty DOT ne DOT jp>
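
As a reference for the expected behaviour in the original report, the following is a minimal de-quoting sketch. It is not the code in cygwin1.dll: split_args is a hypothetical helper, backslash escapes and globbing are ignored, and the input is assumed to be already UTF-8, so bytes outside ASCII are simply copied through.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not Cygwin's implementation: split a command line
   into arguments and drop the double quotes.  Bytes >= 0x80 (UTF-8
   sequences) are copied unchanged.  Backslash escapes are not handled.
   Returns the number of arguments stored in argv (at most max_args);
   each argv[i] is malloc'ed and must be freed by the caller. */
static size_t split_args(const char *cmdline, char **argv, size_t max_args)
{
    size_t argc = 0;
    const char *p = cmdline;

    assert(cmdline && argv);

    while (*p && argc < max_args) {
        /* Skip blanks between arguments. */
        while (*p == ' ' || *p == '\t')
            p++;
        if (!*p)
            break;

        /* The argument can never be longer than the remaining input. */
        char *out = malloc(strlen(p) + 1);
        if (!out)
            break;
        size_t n = 0;
        int in_quotes = 0;

        for (; *p; p++) {
            if (*p == '"') {
                in_quotes = !in_quotes;   /* toggles state, is not copied */
            } else if (!in_quotes && (*p == ' ' || *p == '\t')) {
                break;                    /* unquoted blank ends the argument */
            } else {
                out[n++] = *p;            /* any other byte, ASCII or not */
            }
        }
        out[n] = '\0';
        argv[argc++] = out;
    }
    return argc;
}
```

Called on the argument part of the reported command line, `. "/i/" "/и/" .`, this returns four arguments with no quotes left in any of them, i.e. the output that the 2.2.1 version produced.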