From: Thomas Wolff via Cygwin
Reply-To: Thomas Wolff
To: cygwin AT cygwin DOT com
Date: Wed, 23 Jul 2025 04:25:36 +0200
Subject: Re: readdir() returns inaccessible name if file was created with invalid UTF-8
Message-ID: <4ab2c1b7-3164-4556-ba36-29814ecf5766@towo.net>
In-Reply-To: <91f26856-72b0-483b-8d04-bd90a27b6be0@towo.net>

On 22.07.2025 at 17:09, Thomas Wolff via Cygwin wrote:
>
>
> On 22.07.2025 at 15:05, Corinna Vinschen wrote:
>> On Jul 22 05:38, Thomas Wolff via Cygwin wrote:
>>> On 27.06.2025 at 12:30, Corinna Vinschen via Cygwin wrote:
>>>> On Jun 26 19:07, Christian Franke via Cygwin wrote:
>>>>> With some trial and error I found a testcase for this more serious
>>>>> problem reported yesterday but not quoted above:
>>>>>
>>>>>>> In cases like file3-... above, the converted Windows path ends with
>>>>>>> 0xF000.
>>>>>>> This suggests that this is an accidental conversion of the
>>>>>>> terminating null to the 0xF0xx range.
>>>>>>>
>>>>>>> In some cases, the created Windows file name has random garbage
>>>>>>> behind the 0xF000. Then even Cygwin is not able to access or unlink
>>>>>>> the file after creation.
>>>>> Testcase (attached):
>>>> Thanks for the testcase!
>>>>
>>>> I found the problem in the newlib core function creating wchar_t from
>>>> UTF-8 input.  In case of 4 byte UTF-8 sequences, the code created the
>>>> low surrogate already after reading byte 3, without checking if byte 4
>>>> of the UTF-8 sequence is a valid byte. Hilarity ensues.
>>> I'm afraid the fix may have broken mbrtowc as I just reported to the
>>> list, with a test case, thus also breaking mintty.
>>> The low surrogate MUST be created after byte 3 because otherwise the
>>> high surrogate cannot be delivered after byte 4 as it needs to.
>>> I think it's a drawback of UTF-16 that must be swallowed, even if some
>>> incorrect sequences slip through somehow.
>> Bummer.  What bugs me most is that you might be right here. It's a bit
>> late, but we should have defined wchar_t as a 4 byte type back when we
>> worked on Cygwin 1.7.0... sigh.
>>
>> mbrtowc() is inherently a bad idea when it comes to UTF-16. It's a
>> function which only works really correctly for the unicode base plane,
>> or if wchar_t is big enough.
>>
>> It's the reason we don't use mbrtowc() if possible.  It's better to call
>> mbstowcs() or friends and allow at least 3 chars in the wchar_t buffer.
>> You can't change that in mintty by any chance?
> Well, I've started to think about a workaround but it's code I've
> never touched before and I'd need to carefully ponder about all kinds
> of possible special situations, so my testing effort would be high.
> Also, I'd need to implement bytewise mbr collection which is right now
> done by that function.
> Since not using mbrtowc anymore would leave it still broken (and what
> other software may fall into that trap...), I'd prefer a fix of that
> function anyway.

I've checked whether the old version of mbrtowc from newlib could be used
directly in mintty, but it pulls in too many dependencies...
I've also checked whether _mbrtowc_r, which is defined in wchar.h, could be
used instead, but it does not link.

By the way, the discussion and the commit log mix up the order: the high
surrogate comes first.

>
> Thomas
>
>> Corinna
>
>
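
To illustrate the surrogate order, here is a minimal C sketch of how a 4-byte UTF-8 sequence maps onto a UTF-16 surrogate pair. It is not the newlib implementation, and the helper names (is_valid_utf8_4, high_surrogate_first3, low_surrogate_last2) are made up for this example. It only shows the bit layout: the high surrogate depends on bytes 1 to 3 alone, while the low surrogate also needs byte 4. That is presumably why a byte-at-a-time converter can hand out the high surrogate before byte 4 has been checked.

```c
#include <stdint.h>

/* Hypothetical helper (not newlib code): returns 1 if s[0..3] has the shape
   of a 4-byte UTF-8 sequence encoding U+10000..U+10FFFF, else 0. */
static int is_valid_utf8_4(const unsigned char *s)
{
    if ((s[0] & 0xF8) != 0xF0)          /* lead byte must be 11110xxx */
        return 0;
    for (int i = 1; i < 4; i++)
        if ((s[i] & 0xC0) != 0x80)      /* continuation bytes are 10xxxxxx */
            return 0;
    uint32_t cp = ((uint32_t)(s[0] & 0x07) << 18)
                | ((uint32_t)(s[1] & 0x3F) << 12)
                | ((uint32_t)(s[2] & 0x3F) << 6)
                |  (uint32_t)(s[3] & 0x3F);
    return cp >= 0x10000 && cp <= 0x10FFFF;  /* rejects overlong/out of range */
}

/* High (first) surrogate, computed from bytes 1..3 only.
   Assumes those three bytes belong to a sequence in U+10000..U+10FFFF.
   Byte 4 carries only code point bits 5..0, which all go into the low
   surrogate, so it is not needed here. */
static uint16_t high_surrogate_first3(const unsigned char *s)
{
    uint32_t top = ((uint32_t)(s[0] & 0x07) << 8)   /* code point bits 20..18 */
                 | ((uint32_t)(s[1] & 0x3F) << 2)   /* code point bits 17..12 */
                 | ((uint32_t)(s[2] & 0x3F) >> 4);  /* code point bits 11..10 */
    return (uint16_t)(0xD800 + (top - 0x40));       /* 0x40 == 0x10000 >> 10 */
}

/* Low (second) surrogate: low 4 bits of byte 3 plus the 6 bits of byte 4. */
static uint16_t low_surrogate_last2(const unsigned char *s)
{
    return (uint16_t)(0xDC00 | ((uint32_t)(s[2] & 0x0F) << 6) | (s[3] & 0x3F));
}
```

For the bytes F0 9F 98 80 (U+1F600), high_surrogate_first3 gives 0xD83D and low_surrogate_last2 gives 0xDE00, in that order. With an invalid fourth byte such as 0x41, the high surrogate is still 0xD83D, even though is_valid_utf8_4 rejects the sequence.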