Message-ID: <11282182-60d1-4841-bf78-5ef78cf30060@towo.net>
Date: Wed, 23 Jul 2025 17:50:31 +0200
Subject: Re: readdir() returns inaccessible name if file was created with invalid UTF-8
To: cygwin AT cygwin DOT com
List-Id: General Cygwin discussions and problem reports
From: Thomas Wolff via Cygwin
Reply-To: Thomas Wolff

On 23.07.2025 at 09:53, Corinna Vinschen via Cygwin wrote:
> On Jul 23 05:44, Thomas Wolff via Cygwin wrote:
>> OK, suppose I'd consider switching to mbs[[n]r]towcs, collecting bytes until
>> the function gives me a result.
>> This would work fine as long as I receive only valid sequences.
>> But look at input string test case
>> char nonbmp[] = {0xF8, 0x88, 0x8A, 0xAF, 0x2D, 0}; // an invalid sequence
>> followed by a valid char
>> The functions only return -1 and (in the case of mbsnrtowcs) do not advance
>> the input pointer.
>> So how am I supposed to recognize that the invalid sequence has ended and a
>> valid character has arrived?
> Yeah, I see the problem.  One of the slightly puzzling behaviours
> of mbsnrtowcs is the fact that the src pointer stays at the start of
> the invalid sequence.  I think the idea is to skip the invalid sequence
> byte-wise until mbsnrtowcs reports a valid sequence again.

But an incomplete sequence could still be completed to become a valid
sequence... So I could cap the number of bytes I collect, say, counting
bytes with the high bit set. I am not sure whether that works for other
multibyte encodings, and it certainly does not for GB18030 (whose trail
bytes can fall in the ASCII range).

> What bugs me is that we have the choice between a broken mbrtowc on
> one side and a chance to generate broken filenames on the other side.

I did not look into those details, but while characters handled by a
terminal arrive sequentially as a stream, a filename can be handled as
one complete string. Isn't that easier to check?

> I think we should actually revert fa272e05bbd0 ("wcstombs: also call
> __WCTOMB on terminating NUL if output buffer is NULL") and see if we can
> fix the filename issue in the Cygwin functions for filename conversion
> alone.
>
> Any ideas appreciated.
>
>
> Corinna
>

-- 
Problem reports:      https://cygwin.com/problems.html
FAQ:                  https://cygwin.com/faq/
Documentation:        https://cygwin.com/docs.html
Unsubscribe info:     https://cygwin.com/ml/#unsubscribe-simple
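
To make the byte-wise skipping idea from this thread concrete, here is a minimal sketch for the complete-string (filename) case. It is UTF-8 only and deliberately does not call mbrtowc or mbsnrtowcs, so that it does not depend on the current locale; the names utf8_try and decode_resync are hypothetical helpers for illustration, not libc or Cygwin functions, and substituting U+FFFD for each skipped byte is an assumption of the sketch. With mbrtowc the analogous loop would reset the mbstate_t and advance by one byte whenever (size_t)-1 is returned.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: try to decode one UTF-8 sequence from s[0..n).
   Returns the number of bytes consumed (1..4) and stores the code point,
   0 if more bytes are needed before a decision can be made,
   -1 if s[0] cannot start a valid sequence at this position. */
int utf8_try(const unsigned char *s, size_t n, uint32_t *cp)
{
    if (n == 0)
        return 0;
    unsigned char b = s[0];
    int len;
    uint32_t c, min;
    if (b < 0x80) { *cp = b; return 1; }
    else if (b >= 0xC2 && b <= 0xDF) { len = 2; c = b & 0x1F; min = 0x80; }
    else if ((b & 0xF0) == 0xE0)     { len = 3; c = b & 0x0F; min = 0x800; }
    else if (b >= 0xF0 && b <= 0xF4) { len = 4; c = b & 0x07; min = 0x10000; }
    else return -1;  /* stray continuation byte, 0xC0/0xC1, or 0xF5..0xFF */
    for (int i = 1; i < len; i++) {
        if ((size_t)i >= n)
            return 0;                   /* prefix is fine so far, but truncated */
        if ((s[i] & 0xC0) != 0x80)
            return -1;                  /* not a continuation byte */
        c = (c << 6) | (s[i] & 0x3F);
    }
    /* Reject overlong forms, surrogates, and values beyond U+10FFFF. */
    if (c < min || c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF))
        return -1;
    *cp = c;
    return len;
}

/* Decode a complete string buf[0..n) into out (capacity cap).
   Each byte that cannot start a valid sequence is skipped on its own and
   replaced by U+FFFD. Because the whole string is available, a sequence
   truncated at the end of input is treated as invalid as well.
   Returns the number of code points written. */
size_t decode_resync(const unsigned char *buf, size_t n,
                     uint32_t *out, size_t cap)
{
    size_t i = 0, k = 0;
    while (i < n && k < cap) {
        uint32_t cp;
        int r = utf8_try(buf + i, n - i, &cp);
        if (r > 0) {
            out[k++] = cp;
            i += (size_t)r;
        } else {
            out[k++] = 0xFFFD;          /* invalid or truncated: skip one byte */
            i += 1;
        }
    }
    return k;
}
```

For the test case above (without the terminating 0), decode_resync yields four U+FFFD entries followed by U+002D. In the terminal (stream) case the return value 0 from utf8_try would instead mean "wait for more input", which is exactly the ambiguity discussed here; the sketch sidesteps it only because a filename is available as a whole.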