Mail Archives: cygwin/2025/07/24/13:46:10

Message-ID: <a41d289a-c440-4616-967c-850d7b7679d6@towo.net>
Date: Thu, 24 Jul 2025 19:45:16 +0200
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: readdir() returns inaccessible name if file was created with
invalid UTF-8
To: cygwin AT cygwin DOT com
References: <96f2253b-791b-b8a0-97dd-8d257eefb9b1 AT t-online DOT de>
<03c4fae7-7322-572c-ae72-52e300f0b438 AT t-online DOT de>
<aFxRfI4NdZ8y5IlK AT calimero DOT vinschen DOT de>
<f78c615c-aefe-b3d0-aada-5f9d0cf73a0a AT t-online DOT de>
<aF5y15iQ840LxLYJ AT calimero DOT vinschen DOT de>
<ca205dbd-907f-4552-9e5c-2cb0050f83a3 AT towo DOT net>
<aH-MtwqARmDmLwoo AT calimero DOT vinschen DOT de>
<91f26856-72b0-483b-8d04-bd90a27b6be0 AT towo DOT net>
<4ab2c1b7-3164-4556-ba36-29814ecf5766 AT towo DOT net>
<68f65634-8f4e-436b-ba6a-d30bdf882aaa AT towo DOT net>
<aIJSqk4abV6QdeVS AT calimero DOT vinschen DOT de>
In-Reply-To: <aIJSqk4abV6QdeVS@calimero.vinschen.de>
X-BeenThere: cygwin AT cygwin DOT com
X-Mailman-Version: 2.1.30
List-Id: General Cygwin discussions and problem reports <cygwin.cygwin.com>
List-Unsubscribe: <https://cygwin.com/mailman/options/cygwin>,
<mailto:cygwin-request AT cygwin DOT com?subject=unsubscribe>
List-Archive: <https://cygwin.com/pipermail/cygwin/>
List-Post: <mailto:cygwin AT cygwin DOT com>
List-Help: <mailto:cygwin-request AT cygwin DOT com?subject=help>
List-Subscribe: <https://cygwin.com/mailman/listinfo/cygwin>,
<mailto:cygwin-request AT cygwin DOT com?subject=subscribe>
From: Thomas Wolff via Cygwin <cygwin AT cygwin DOT com>
Reply-To: Thomas Wolff <towo AT towo DOT net>
Errors-To: cygwin-bounces~archive-cygwin=delorie DOT com AT cygwin DOT com
Sender: "Cygwin" <cygwin-bounces~archive-cygwin=delorie DOT com AT cygwin DOT com>
X-MIME-Autoconverted: from base64 to 8bit by delorie.com id 56OHkACn1546513

On 24.07.2025 at 17:35, Corinna Vinschen wrote:
> Thomas,
>
> On Jul 23 05:44, Thomas Wolff via Cygwin wrote:
>>>> On 22.07.2025 at 15:05, Corinna Vinschen wrote:
>>>>> mbrtowc() is inherently a bad idea when it comes to UTF-16. It's a
>>>>> function which only works really correctly for the unicode base plane,
>>>>> or if wchar_t is big enough.
>>>>>
>>>>> It's the reason we don't use mbrtowc() if possible.  It's better to call
>>>>> mbstowcs() or friends and allow at least 3 chars in the wchar_t buffer.
>>>>> You can't change that in mintty by any chance?
>>> [...]
>> OK, suppose I consider switching to mbs[[n]r]towcs, collecting bytes until
>> the function gives me a result.
>> This would work fine as long as I receive only valid sequences. But look at
>> this input string test case:
>> char nonbmp[] = {0xF8, 0x88, 0x8A, 0xAF, 0x2D, 0}; // an invalid sequence followed by a valid char
>> The functions only return -1 and (in the case of mbsnrtowcs) do not advance
>> the input pointer.
>> So how am I supposed to recognize that the invalid sequence has ended and a
>> valid character has arrived?
> Apart from that, you probably still have a problem in mintty: GB18030.
>
> The problem with GB18030 is that you need all four bytes to generate
> the high surrogate.
>
> Consider the following GB18030 string: 0x90 0x30 0x81 0x30
>
> This string translates into a UTF-16 surrogate pair: 0xd800 0xdc00.
>
> If you run a tweaked version of your test application from
> https://cygwin.com/pipermail/cygwin/2025-July/258513.html:
>
>    setlocale (LC_CTYPE, "zh_CN.gb18030");
>    mb (0x90);
>    mb (0x30);
>    mb (0x81);
>    mb (0x30);
>
> Then the output is:
>
>    90 -> 0000 : -2
>    30 -> 0000 : -2
>    81 -> 0000 : -2
>    30 -> D800 : 0
>
> However, if you notice this situation...
>
>    if (ret_from_mbrtowc == 0 && codeset == gb18030
>        && (wc & 0xfc00) == 0xd800)
>
> ...you can just add a fake NUL byte:
>
>      mbrtowc (&wc, "", 1, &mbstate);
>
> If you do that, the above sequence becomes:
>
>    90 -> 0000 : -2
>    30 -> 0000 : -2
>    81 -> 0000 : -2
>    30 -> D800 : 0
>    00 -> DC00 : 1
>
> I hope this helps, if you didn't already handle GB18030 differently
> in mintty.
Oooff. No, I didn't. So that was already the case before 3.6.4 (and again
3.6.5), right?
Thanks for the notice, I'll check and test your workaround.
Thomas

> Corinna
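
The workaround quoted above could be wrapped up roughly as follows. This
is only a sketch under assumptions, not mintty's or Cygwin's actual code:
the helper names is_high_surrogate, needs_low_surrogate_flush, feed_byte
and demo_gb18030 are made up for illustration, and the caller is assumed
to know whether the current codeset is GB18030. The range check matches
the (wc & 0xfc00) == 0xd800 mask for 16-bit values. On systems with a
32-bit wchar_t the extra branch should simply never trigger.

```c
#include <locale.h>
#include <stdio.h>
#include <string.h>
#include <wchar.h>

/* True if wc is a UTF-16 high (leading) surrogate, 0xD800..0xDBFF. */
static int is_high_surrogate(unsigned long wc)
{
    return wc >= 0xd800UL && wc <= 0xdbffUL;
}

/* Hypothetical helper: the condition from the mail. mbrtowc() returned 0,
   the codeset is GB18030, and the stored value is a high surrogate, so a
   low surrogate is still pending in the conversion state. */
static int needs_low_surrogate_flush(size_t ret, int is_gb18030,
                                     unsigned long wc)
{
    return ret == 0 && is_gb18030 && is_high_surrogate(wc);
}

/* Feed one input byte. Returns the number of wchar_t units stored in
   out[] (0 while the sequence is incomplete, otherwise 1 or 2), or -1
   for an invalid sequence, in which case the state is reset. */
static int feed_byte(unsigned char byte, int is_gb18030,
                     mbstate_t *st, wchar_t out[2])
{
    char c = (char) byte;
    wchar_t wc = 0;
    size_t ret = mbrtowc(&wc, &c, 1, st);

    if (ret == (size_t) -1) {
        memset(st, 0, sizeof *st);
        return -1;
    }
    if (ret == (size_t) -2)
        return 0;                       /* need more bytes */

    out[0] = wc;
    if (needs_low_surrogate_flush(ret, is_gb18030, (unsigned long) wc)) {
        /* Feed a fake NUL byte to fetch the pending low surrogate. */
        ret = mbrtowc(&wc, "", 1, st);
        if (ret == (size_t) -1 || ret == (size_t) -2) {
            memset(st, 0, sizeof *st);
            return -1;
        }
        out[1] = wc;
        return 2;
    }
    return 1;
}

/* Runs the four-byte GB18030 example from the mail through feed_byte().
   Returns -1 if the zh_CN.gb18030 locale is not available. */
int demo_gb18030(void)
{
    static const unsigned char input[] = { 0x90, 0x30, 0x81, 0x30 };
    mbstate_t st;
    wchar_t out[2];
    size_t i;

    if (setlocale(LC_CTYPE, "zh_CN.gb18030") == NULL)
        return -1;
    memset(&st, 0, sizeof st);
    for (i = 0; i < sizeof input; i++) {
        int n = feed_byte(input[i], 1, &st, out);
        printf("%02X -> %d unit(s)", input[i], n);
        for (int k = 0; k < n; k++)
            printf(" %04lX", (unsigned long) out[k]);
        putchar('\n');
    }
    return 0;
}
```

Calling demo_gb18030() from main() prints one line per input byte. What
it prints depends on the platform's wchar_t width and on whether the
locale is installed, so no expected output is given here.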


-- 
Problem reports:      https://cygwin.com/problems.html
FAQ:                  https://cygwin.com/faq/
Documentation:        https://cygwin.com/docs.html
Unsubscribe info:     https://cygwin.com/ml/#unsubscribe-simple
