Message-ID: <91f26856-72b0-483b-8d04-bd90a27b6be0@towo.net>
Date: Tue, 22 Jul 2025 17:09:52 +0200
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: readdir() returns inaccessible name if file was created with
 invalid UTF-8
To: cygwin@cygwin.com
References: <96f2253b-791b-b8a0-97dd-8d257eefb9b1@t-online.de>
 <03c4fae7-7322-572c-ae72-52e300f0b438@t-online.de>
 <aFxRfI4NdZ8y5IlK@calimero.vinschen.de>
 <f78c615c-aefe-b3d0-aada-5f9d0cf73a0a@t-online.de>
 <aF5y15iQ840LxLYJ@calimero.vinschen.de>
 <ca205dbd-907f-4552-9e5c-2cb0050f83a3@towo.net>
 <aH-MtwqARmDmLwoo@calimero.vinschen.de>
In-Reply-To: <aH-MtwqARmDmLwoo@calimero.vinschen.de>
X-BeenThere: cygwin@cygwin.com
X-Mailman-Version: 2.1.30
Precedence: list
List-Id: General Cygwin discussions and problem reports <cygwin.cygwin.com>
List-Unsubscribe: <https://cygwin.com/mailman/options/cygwin>,
 <mailto:cygwin-request@cygwin.com?subject=unsubscribe>
List-Archive: <https://cygwin.com/pipermail/cygwin/>
List-Post: <mailto:cygwin@cygwin.com>
List-Help: <mailto:cygwin-request@cygwin.com?subject=help>
List-Subscribe: <https://cygwin.com/mailman/listinfo/cygwin>,
 <mailto:cygwin-request@cygwin.com?subject=subscribe>
From: Thomas Wolff via Cygwin <cygwin@cygwin.com>
Reply-To: Thomas Wolff <towo@towo.net>
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset="us-ascii"; Format="flowed"
Errors-To: cygwin-bounces~archive-cygwin=delorie.com@cygwin.com
Sender: "Cygwin" <cygwin-bounces~archive-cygwin=delorie.com@cygwin.com>



On 22.07.2025 at 15:05, Corinna Vinschen wrote:
> On Jul 22 05:38, Thomas Wolff via Cygwin wrote:
>> On 27.06.2025 at 12:30, Corinna Vinschen via Cygwin wrote:
>>> On Jun 26 19:07, Christian Franke via Cygwin wrote:
>>>> With some trial and error I found a testcase for this more serious problem
>>>> reported yesterday but not quoted above:
>>>>
>>>>>> In cases like file3-... above, the converted Windows path ends with
>>>>>> 0xF000. This suggests that this is an accidental conversion of the
>>>>>> terminating null to the 0xF0xx range.
>>>>>>
>>>>>> In some cases, the created Windows file name has random garbage
>>>>>> behind the 0xF000. Then even Cygwin is not able to access or unlink
>>>>>> the file after creation.
>>>> Testcase (attached):
>>> Thanks for the testcase!
>>>
>>> I found the problem in the newlib core function creating wchar_t from
>>> UTF-8 input.  In case of 4 byte UTF-8 sequences, the code created the
>>> low surrogate already after reading byte 3, without checking if byte 4
>>> of the UTF-8 sequence is a valid byte. Hilarity ensues.
>> I'm afraid the fix may have broken mbrtowc as I just reported to the list,
>> with a test case, thus also breaking mintty.
>> The low surrogate MUST be created after byte 3 because otherwise the high
>> surrogate cannot be delivered after byte 4 as it needs to.
>> I think it's a drawback of UTF-16 that must be swallowed, even if some
>> incorrect sequences slip through somehow.
> Bummer.  What bugs me most is that you might be right here.  It's a bit
> late, but we should have defined wchar_t as a 4 byte type back when we
> worked on Cygwin 1.7.0... sigh.
>
> mbrtowc() is inherently a bad idea when it comes to UTF-16.  It's a
> function which only works really correctly for the unicode base plane,
> or if wchar_t is big enough.
>
> It's the reason we don't use mbrtowc() if possible.  It's better to call
> mbstowcs() or friends and allow at least 3 chars in the wchar_t buffer.
> You can't change that in mintty by any chance?
Well, I've started to think about a workaround, but it's code I've never
touched before, and I'd need to think carefully through all kinds of
possible special cases, so my testing effort would be high. Also,
I'd need to implement the byte-wise collection of multibyte sequences
myself, which mbrtowc currently does for me.
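
To make the "byte-wise collection" concrete, here is a minimal sketch of
what such a workaround could look like. It is not mintty or newlib code;
the names utf8_collector and utf8_feed are hypothetical. It assumes the
caller feeds one byte at a time and is happy to receive nothing until a
sequence is complete and validated, which is exactly where it differs
from mbrtowc with a 16-bit wchar_t.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper state; not part of mintty or newlib. */
typedef struct {
    uint32_t cp;        /* code point bits collected so far */
    int remaining;      /* continuation bytes still expected */
    uint32_t min;       /* smallest code point valid for this length */
} utf8_collector;

/* Feed one byte. Returns the number of UTF-16 units written to out:
   0 while a sequence is still incomplete, 1 or 2 once it completes.
   Returns -1 on an invalid byte or sequence; the state is reset and
   the offending byte is dropped (a real caller may want to re-feed it). */
static int utf8_feed(utf8_collector *st, unsigned char b, uint16_t out[2])
{
    if (st->remaining == 0) {
        if (b < 0x80) { out[0] = b; return 1; }
        if (b >= 0xC2 && b <= 0xDF) {
            st->cp = b & 0x1F; st->remaining = 1; st->min = 0x80;
        } else if ((b & 0xF0) == 0xE0) {
            st->cp = b & 0x0F; st->remaining = 2; st->min = 0x800;
        } else if (b >= 0xF0 && b <= 0xF4) {
            st->cp = b & 0x07; st->remaining = 3; st->min = 0x10000;
        } else {
            return -1;  /* stray continuation byte or invalid lead byte */
        }
        return 0;
    }
    if ((b & 0xC0) != 0x80) { st->remaining = 0; return -1; }
    st->cp = (st->cp << 6) | (uint32_t)(b & 0x3F);
    if (--st->remaining > 0)
        return 0;
    /* Sequence complete: reject overlong forms, surrogates, and
       values beyond U+10FFFF before emitting anything. */
    if (st->cp < st->min || st->cp > 0x10FFFF ||
        (st->cp >= 0xD800 && st->cp <= 0xDFFF))
        return -1;
    if (st->cp < 0x10000) { out[0] = (uint16_t)st->cp; return 1; }
    uint32_t v = st->cp - 0x10000;
    out[0] = (uint16_t)(0xD800 | (v >> 10));    /* first unit of the pair */
    out[1] = (uint16_t)(0xDC00 | (v & 0x3FF));  /* second unit of the pair */
    return 2;
}
```

Feeding F0 9F 98 80 (U+1F600) yields nothing for the first three bytes
and both units, 0xD83D 0xDE00, on the fourth. The price is that the
caller has to cope with two units arriving at once, which is the part
that would need the careful testing mentioned above.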
Since dropping mbrtowc from mintty would still leave the function itself
broken (and who knows what other software may fall into that trap...),
I'd prefer a fix of that function anyway.
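
For reference, the bit arithmetic behind the constraint discussed in
this thread can be sketched as follows. This is an illustration only,
not the newlib implementation; the function names are made up. It
assumes the bytes already form a valid 4-byte sequence (lead byte
F0..F4, code point at least U+10000). The first code unit of the pair
(the high surrogate in Unicode terms, range 0xD800..0xDBFF) is fully
determined by bytes 1 to 3, while the second unit needs byte 4.

```c
#include <stdint.h>

/* First unit of the surrogate pair, computed from bytes 1..3 only.
   Assumes a valid 4-byte sequence; no validation is done here. */
static uint16_t first_unit_from_three(unsigned char b1, unsigned char b2,
                                      unsigned char b3)
{
    /* Code point bits 20..10: 3 bits from b1, 6 from b2,
       and the upper 2 payload bits of b3. */
    uint32_t top = ((uint32_t)(b1 & 0x07) << 8)
                 | ((uint32_t)(b2 & 0x3F) << 2)
                 | ((uint32_t)(b3 & 0x3F) >> 4);
    /* Subtracting 0x10000 from the code point equals subtracting
       0x40 from its bits 20..10. */
    return (uint16_t)(0xD800 | (top - 0x40));
}

/* Second unit of the pair: lower 4 payload bits of b3 plus all 6 of b4. */
static uint16_t second_unit_from_last_two(unsigned char b3, unsigned char b4)
{
    return (uint16_t)(0xDC00 | ((uint32_t)(b3 & 0x0F) << 6)
                             | (uint32_t)(b4 & 0x3F));
}
```

So a converter that has to hand out one 16-bit unit per call can produce
the first unit after byte 3, but at that point it has not yet seen
whether byte 4 is a valid continuation byte.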

Thomas

> Corinna


-- 
Problem reports:      https://cygwin.com/problems.html
FAQ:                  https://cygwin.com/faq/
Documentation:        https://cygwin.com/docs.html
Unsubscribe info:     https://cygwin.com/ml/#unsubscribe-simple
