Date: Sat, 3 Dec 2022 16:24:53 +0100
From: Corinna Vinschen via Cygwin
Reply-To: cygwin AT cygwin DOT com
To: cygwin AT cygwin DOT com
Cc: Corinna Vinschen
Subject: Re: [BUG core?] Regression with parsing Windows’ command-line
References: <20221116124824 DOT zzobomcsmowvjtbr AT math DOT berkeley DOT edu> <20221203034030 DOT a6ghnwcze4rkqeap AT math DOT berkeley DOT edu> <20221203192810 DOT 03c73015303ef3ad4fe241f3 AT nifty DOT ne DOT jp>
In-Reply-To: <20221203192810.03c73015303ef3ad4fe241f3@nifty.ne.jp>

On Dec  3 19:28, Takashi Yano via Cygwin wrote:
> On Fri, 2 Dec 2022 19:40:30 -0800
> Ilya Zakharevich wrote:
> > On Wed, Nov 16, 2022 at 04:48:25AM -0800, I wrote:
> > > De-quoting (converting the Windows’ command-line into argc/argv) does
> > > not remove double quotes if characters not fit for 8-bit (?) are present.
> > >
> > > To reproduce, do in CMD’s command line:
> > >
> > >    D:\> D:\Programs\cygwin2022\bin\perl -wle "print for @ARGV" . "/i/" "/и/" .
> > >    .
> > >    /i/
> > >    "/и/"
> > >    .
> > [...]
> This certainly seems to be a problem of cygwin1.dll.
>
> Though I am not sure this is the right thing, I have confirmed
> that the following patch solves the issue.
>
> diff --git a/newlib/libc/locale/lctype.c b/newlib/libc/locale/lctype.c
> index 644669765..732d132e1 100644
> --- a/newlib/libc/locale/lctype.c
> +++ b/newlib/libc/locale/lctype.c
> @@ -25,11 +25,20 @@
>
>  #define LCCTYPE_SIZE (sizeof(struct lc_ctype_T) / sizeof(char *))
>
> +#ifdef __CYGWIN__
> +static char	numsix[] = { '\6', '\0'};
> +#else
>  static char	numone[] = { '\1', '\0'};
> +#endif
>
>  const struct lc_ctype_T _C_ctype_locale = {
> +#ifdef __CYGWIN__
> +	"UTF-8",			/* codeset */
> +	numsix				/* mb_cur_max */
> +#else
>  	"ASCII",			/* codeset */
>  	numone				/* mb_cur_max */
> +#endif

Good idea, but this transforms the "C" locale into the "C.UTF-8" locale
once and for all.

What we're actually missing is a matching _C_utf8_ctype_locale which
can be used by Cygwin as default locale setting, AFAICS.

I pushed a patch and the test release is rebuilding while I type.


Thanks,
Corinna

-- 
Problem reports:      https://cygwin.com/problems.html
FAQ:                  https://cygwin.com/faq/
Documentation:        https://cygwin.com/docs.html
Unsubscribe info:     https://cygwin.com/ml/#unsubscribe-simple
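
For anyone who wants to see what the two fields touched by the quoted patch
(codeset and mb_cur_max) look like at run time, here is a minimal sketch using
only standard POSIX calls. The names ctype_info, probe_ctype and print_ctype are
hypothetical helpers made up for this illustration; they are not part of
newlib or Cygwin, and this is not the patch that was pushed.

```c
#include <assert.h>
#include <langinfo.h>
#include <locale.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper type, for illustration only (not a newlib/Cygwin API). */
struct ctype_info {
    char   codeset[64];   /* copy of nl_langinfo(CODESET) */
    size_t mb_cur_max;    /* value of MB_CUR_MAX in that locale */
};

/*
 * Switch LC_CTYPE to locale_name ("C", "C.UTF-8", or "" for the
 * environment default) and record its codeset and MB_CUR_MAX.
 * Returns 0 on success, -1 if setlocale() rejects the name.
 * Note: this changes the process-wide LC_CTYPE setting.
 */
int probe_ctype(const char *locale_name, struct ctype_info *out)
{
    assert(locale_name != NULL && out != NULL);

    if (setlocale(LC_CTYPE, locale_name) == NULL)
        return -1;

    /* nl_langinfo() may return static storage, so copy the string. */
    snprintf(out->codeset, sizeof out->codeset, "%s", nl_langinfo(CODESET));
    out->mb_cur_max = (size_t) MB_CUR_MAX;
    return 0;
}

/* Print one line per locale so the "C" and default settings can be compared. */
void print_ctype(const char *label, const char *locale_name)
{
    struct ctype_info info;

    if (probe_ctype(locale_name, &info) != 0) {
        printf("%-12s (locale not available)\n", label);
        return;
    }
    printf("%-12s codeset=%s mb_cur_max=%zu\n",
           label, info.codeset, info.mb_cur_max);
}
```

Calling print_ctype("C", "C") and print_ctype("default", "") from a small main()
shows both settings side by side. The point of the reply above is that an
explicitly requested "C" locale should keep its single-byte codeset, with any
UTF-8 behaviour coming from a separate default table, so the first line is the
one that would change if the quoted patch were applied as-is.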