Mail Archives: cygwin/2025/07/22/11:06:42

Message-ID: <91f26856-72b0-483b-8d04-bd90a27b6be0@towo.net>
Date: Tue, 22 Jul 2025 17:09:52 +0200
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: readdir() returns inaccessible name if file was created with
invalid UTF-8
To: cygwin AT cygwin DOT com
References: <96f2253b-791b-b8a0-97dd-8d257eefb9b1 AT t-online DOT de>
<03c4fae7-7322-572c-ae72-52e300f0b438 AT t-online DOT de>
<aFxRfI4NdZ8y5IlK AT calimero DOT vinschen DOT de>
<f78c615c-aefe-b3d0-aada-5f9d0cf73a0a AT t-online DOT de>
<aF5y15iQ840LxLYJ AT calimero DOT vinschen DOT de>
<ca205dbd-907f-4552-9e5c-2cb0050f83a3 AT towo DOT net>
<aH-MtwqARmDmLwoo AT calimero DOT vinschen DOT de>
In-Reply-To: <aH-MtwqARmDmLwoo@calimero.vinschen.de>
List-Id: General Cygwin discussions and problem reports <cygwin.cygwin.com>
From: Thomas Wolff via Cygwin <cygwin AT cygwin DOT com>
Reply-To: Thomas Wolff <towo AT towo DOT net>


On 22.07.2025 at 15:05, Corinna Vinschen wrote:
> On Jul 22 05:38, Thomas Wolff via Cygwin wrote:
>> On 27.06.2025 at 12:30, Corinna Vinschen via Cygwin wrote:
>>> On Jun 26 19:07, Christian Franke via Cygwin wrote:
>>>> With some trial and error I found a testcase for this more serious problem
>>>> reported yesterday but not quoted above:
>>>>
>>>>>> In cases like file3-... above, the converted Windows path ends with
>>>>>> 0xF000. This suggests that this is an accidental conversion of the
>>>>>> terminating null to the 0xF0xx range.
>>>>>>
>>>>>> In some cases, the created Windows file name has random garbage
>>>>>> behind the 0xF000. Then even Cygwin is not able to access or unlink
>>>>>> the file after creation.
>>>> Testcase (attached):
>>> Thanks for the testcase!
>>>
>>> I found the problem in the newlib core function creating wchar_t from
>>> UTF-8 input.  In case of 4 byte UTF-8 sequences, the code created the
>>> low surrogate already after reading byte 3, without checking if byte 4
>>> of the UTF-8 sequence is a valid byte. Hilarity ensues.
>> I'm afraid the fix may have broken mbrtowc as I just reported to the list,
>> with a test case, thus also breaking mintty.
>> The low surrogate MUST be created after byte 3 because otherwise the high
>> surrogate cannot be delivered after byte 4 as it needs to.
>> I think it's a drawback of UTF-16 that must be swallowed, even if some
>> incorrect sequences slip through somehow.
> Bummer.  What bugs me most is that you might be right here.  It's a bit
> late, but we should have defined wchar_t as a 4 byte type back when we
> worked on Cygwin 1.7.0... sigh.
>
> mbrtowc() is inherently a bad idea when it comes to UTF-16.  It's a
> function which only works really correctly for the unicode base plane,
> or if wchar_t is big enough.
>
> It's the reason we don't use mbrtowc() if possible.  It's better to call
> mbstowcs() or friends and allow at least 3 chars in the wchar_t buffer.
> You can't change that in mintty by any chance?
Well, I've started to think about a workaround, but it's code I've never
touched before, and I'd need to think carefully through all kinds of
special cases, so the testing effort would be high. I'd also need to
implement the byte-wise collection of multibyte sequences myself, which
mbrtowc currently does for me.
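
To illustrate what such byte-wise collection might involve, here is a
minimal sketch. It is not mintty or newlib code; the names u8_collector,
u8_seq_len and u8_collect are hypothetical. It buffers bytes until a whole
UTF-8 sequence has arrived and only then emits one or two UTF-16 units, so
no surrogate is produced before byte 4 has been checked.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical collector state; zero-initialise before first use. */
typedef struct {
    unsigned char buf[4]; /* bytes of the sequence collected so far */
    int have;             /* number of bytes in buf */
    int want;             /* total sequence length, 0 when idle */
} u8_collector;

/* Sequence length implied by a lead byte, or 0 if it cannot start one. */
static int u8_seq_len(unsigned char lead)
{
    if (lead < 0x80) return 1;
    if (lead >= 0xC2 && lead <= 0xDF) return 2;
    if (lead >= 0xE0 && lead <= 0xEF) return 3;
    if (lead >= 0xF0 && lead <= 0xF4) return 4;
    return 0;
}

/* Feed one byte. Returns the number of UTF-16 units stored in out
 * (1 or 2), 0 if more bytes are needed, or -1 on an invalid sequence.
 * Simplification: an offending byte is dropped, not re-read as a lead. */
int u8_collect(u8_collector *c, unsigned char b, uint16_t out[2])
{
    uint32_t cp;
    int n;

    assert(c != NULL && out != NULL);
    if (c->want == 0) {
        n = u8_seq_len(b);
        if (n == 0) return -1;
        c->want = n;
        c->have = 0;
    } else if ((b & 0xC0) != 0x80) {
        c->want = 0;          /* not a continuation byte: reset */
        return -1;
    }
    c->buf[c->have++] = b;
    if (c->have < c->want) return 0;

    /* All bytes are present and checked; decode the code point. */
    n = c->want;
    c->want = 0;
    switch (n) {
    case 1:
        cp = c->buf[0];
        break;
    case 2:
        cp = ((uint32_t)(c->buf[0] & 0x1F) << 6) | (c->buf[1] & 0x3F);
        break;
    case 3:
        cp = ((uint32_t)(c->buf[0] & 0x0F) << 12)
           | ((uint32_t)(c->buf[1] & 0x3F) << 6) | (c->buf[2] & 0x3F);
        if (cp < 0x800 || (cp >= 0xD800 && cp <= 0xDFFF)) return -1;
        break;
    default:
        cp = ((uint32_t)(c->buf[0] & 0x07) << 18)
           | ((uint32_t)(c->buf[1] & 0x3F) << 12)
           | ((uint32_t)(c->buf[2] & 0x3F) << 6) | (c->buf[3] & 0x3F);
        if (cp < 0x10000 || cp > 0x10FFFF) return -1;
        break;
    }
    if (cp < 0x10000) {
        out[0] = (uint16_t)cp;
        return 1;
    }
    cp -= 0x10000;            /* split into a surrogate pair */
    out[0] = (uint16_t)(0xD800 | (cp >> 10));
    out[1] = (uint16_t)(0xDC00 | (cp & 0x3FF));
    return 2;
}
```

The price is that the caller has to accept two units from a single call,
which is exactly what the one-wchar_t-per-call interface of mbrtowc cannot
offer.
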
Since dropping mbrtowc from mintty would still leave the function itself
broken (and who knows what other software may fall into that trap...),
I'd prefer a fix to that function anyway.
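
For reference, a rough sketch of the constraint any fix has to live with.
This is an illustration only, not newlib's implementation; u8state, u8_feed
and the U8_* codes are made-up names. With a 16-bit unit and one unit
returned per call, a decoder fed byte by byte has to hand out the first unit
of the surrogate pair once byte 3 has arrived, so that byte 4 can deliver
the second one. A bad byte 4 is then detected only after the first unit has
already gone out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { U8_BAD = -1, U8_MORE = 0, U8_UNIT = 1 };

/* Hypothetical incremental state; zero-initialise before first use. */
typedef struct {
    uint32_t bits; /* payload bits gathered so far */
    int len;       /* total sequence length, 0 when idle */
    int seen;      /* bytes consumed of the current sequence */
} u8state;

/* Feed one byte; at most one UTF-16 unit is returned per call. */
int u8_feed(u8state *st, unsigned char b, uint16_t *out)
{
    uint32_t cp;
    int len;

    assert(st != NULL && out != NULL);
    if (st->len == 0) {                       /* expecting a lead byte */
        if (b < 0x80) { *out = b; return U8_UNIT; }
        if (b >= 0xC2 && b <= 0xDF)      { st->len = 2; st->bits = b & 0x1F; }
        else if (b >= 0xE0 && b <= 0xEF) { st->len = 3; st->bits = b & 0x0F; }
        else if (b >= 0xF0 && b <= 0xF4) { st->len = 4; st->bits = b & 0x07; }
        else return U8_BAD;
        st->seen = 1;
        return U8_MORE;
    }
    if ((b & 0xC0) != 0x80) {                 /* not a continuation byte */
        st->len = 0;
        return U8_BAD;
    }
    st->bits = (st->bits << 6) | (b & 0x3F);
    st->seen++;

    if (st->len == 4 && st->seen == 3) {
        /* bits holds the code point >> 6; valid range is 0x400..0x43FF. */
        if (st->bits < 0x400 || st->bits > 0x43FF) {
            st->len = 0;
            return U8_BAD;
        }
        /* First unit of the pair goes out before byte 4 has been seen. */
        *out = (uint16_t)(0xD800 + ((st->bits - 0x400) >> 4));
        return U8_UNIT;
    }
    if (st->seen < st->len) return U8_MORE;

    cp = st->bits;
    len = st->len;
    st->len = 0;
    if (len == 4) {                           /* second unit of the pair */
        *out = (uint16_t)(0xDC00 | (cp & 0x3FF));
        return U8_UNIT;
    }
    if (len == 3 && (cp < 0x800 || (cp >= 0xD800 && cp <= 0xDFFF)))
        return U8_BAD;
    *out = (uint16_t)cp;
    return U8_UNIT;
}
```

Feeding F0 9F 98 and then an ASCII byte shows the effect: the third call
returns a unit, the fourth reports an error, and the caller is left holding
an unpaired surrogate.
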

Thomas

> Corinna


