| Date: | Wed, 23 Jul 2025 09:53:41 +0200 |
| To: | cygwin AT cygwin DOT com |
| Subject: | Re: readdir() returns inaccessible name if file was created with |
| invalid UTF-8 | |
| Message-ID: | <aICVBQzWUiCYwnL2@calimero.vinschen.de> |
| Mail-Followup-To: | cygwin AT cygwin DOT com |
| References: | <96f2253b-791b-b8a0-97dd-8d257eefb9b1 AT t-online DOT de> |
| <03c4fae7-7322-572c-ae72-52e300f0b438 AT t-online DOT de> | |
| <aFxRfI4NdZ8y5IlK AT calimero DOT vinschen DOT de> | |
| <f78c615c-aefe-b3d0-aada-5f9d0cf73a0a AT t-online DOT de> | |
| <aF5y15iQ840LxLYJ AT calimero DOT vinschen DOT de> | |
| <ca205dbd-907f-4552-9e5c-2cb0050f83a3 AT towo DOT net> | |
| <aH-MtwqARmDmLwoo AT calimero DOT vinschen DOT de> | |
| <91f26856-72b0-483b-8d04-bd90a27b6be0 AT towo DOT net> | |
| <4ab2c1b7-3164-4556-ba36-29814ecf5766 AT towo DOT net> | |
| <68f65634-8f4e-436b-ba6a-d30bdf882aaa AT towo DOT net> | |
| MIME-Version: | 1.0 |
| In-Reply-To: | <68f65634-8f4e-436b-ba6a-d30bdf882aaa@towo.net> |
| From: | Corinna Vinschen via Cygwin <cygwin AT cygwin DOT com> |
| Reply-To: | cygwin AT cygwin DOT com |
| Cc: | Corinna Vinschen <corinna-cygwin AT cygwin DOT com> |
On Jul 23 05:44, Thomas Wolff via Cygwin wrote:
> OK, suppose I'd consider to switch to mbs[[n]r]towcs, collecting bytes until
> the function gives me a result.
> This would work fine as long as I receive only valid sequences. But look at
> input string test case
> char nonbmp[] = {0xF8, 0x88, 0x8A, 0xAF, 0x2D, 0}; // an invalid sequence followed by a valid char
> The functions only return -1 and (in the case of mbsnrtowcs) do not advance
> the input pointer.
> So how am I supposed to recognize that the invalid sequence has ended and a
> valid character has arrived?
Yeah, I see the problem. One of the slightly puzzling behaviours
of mbsnrtowcs is the fact that the src pointer stays at the start of
the invalid sequence. I think the idea is to skip the invalid sequence
byte-wise until mbsnrtowcs reports a valid sequence again.
What bugs me is that we have the choice between a broken mbrtowc on
one side and a chance to generate broken filenames on the other side.
I think we should actually revert fa272e05bbd0 ("wcstombs: also call
__WCTOMB on terminating NUL if output buffer is NULL") and see if we can
fix the filename issue in the Cygwin functions for filename conversion
alone.
Any ideas appreciated.
Corinna