Mail Archives: cygwin/2025/07/24/11:37:14

DMARC-Filter: OpenDMARC Filter v1.4.2 delorie.com 56OFbDAd1498479
Authentication-Results: delorie.com; dmarc=pass (p=none dis=none) header.from=cygwin.com
Authentication-Results: delorie.com; spf=pass smtp.mailfrom=cygwin.com
DKIM-Filter: OpenDKIM Filter v2.11.0 delorie.com 56OFbDAd1498479
X-Recipient: archive-cygwin AT delorie DOT com
X-Original-To: cygwin AT cygwin DOT com
Delivered-To: cygwin AT cygwin DOT com
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 6B24B3856DF0
Date: Thu, 24 Jul 2025 17:36:38 +0200
To: cygwin AT cygwin DOT com
Subject: Re: readdir() returns inaccessible name if file was created with invalid UTF-8
Message-ID: <aIJTBuRjlVNDNWD7@calimero.vinschen.de>
Mail-Followup-To: cygwin AT cygwin DOT com
References: <aF5y15iQ840LxLYJ AT calimero DOT vinschen DOT de>
<ca205dbd-907f-4552-9e5c-2cb0050f83a3 AT towo DOT net>
<aH-MtwqARmDmLwoo AT calimero DOT vinschen DOT de>
<91f26856-72b0-483b-8d04-bd90a27b6be0 AT towo DOT net>
<4ab2c1b7-3164-4556-ba36-29814ecf5766 AT towo DOT net>
<68f65634-8f4e-436b-ba6a-d30bdf882aaa AT towo DOT net>
<aICVBQzWUiCYwnL2 AT calimero DOT vinschen DOT de>
<11282182-60d1-4841-bf78-5ef78cf30060 AT towo DOT net>
<aIILWiKsr99DOaI8 AT calimero DOT vinschen DOT de>
<aec69850-227c-4c37-8aa9-6ea97dbec25b AT systematicsw DOT ab DOT ca>
MIME-Version: 1.0
In-Reply-To: <aec69850-227c-4c37-8aa9-6ea97dbec25b@systematicsw.ab.ca>
X-BeenThere: cygwin AT cygwin DOT com
X-Mailman-Version: 2.1.30
List-Id: General Cygwin discussions and problem reports <cygwin.cygwin.com>
List-Unsubscribe: <https://cygwin.com/mailman/options/cygwin>,
<mailto:cygwin-request AT cygwin DOT com?subject=unsubscribe>
List-Archive: <https://cygwin.com/pipermail/cygwin/>
List-Post: <mailto:cygwin AT cygwin DOT com>
List-Help: <mailto:cygwin-request AT cygwin DOT com?subject=help>
List-Subscribe: <https://cygwin.com/mailman/listinfo/cygwin>,
<mailto:cygwin-request AT cygwin DOT com?subject=subscribe>
From: Corinna Vinschen via Cygwin <cygwin AT cygwin DOT com>
Reply-To: cygwin AT cygwin DOT com
Cc: Corinna Vinschen <corinna-cygwin AT cygwin DOT com>
Errors-To: cygwin-bounces~archive-cygwin=delorie DOT com AT cygwin DOT com
Sender: "Cygwin" <cygwin-bounces~archive-cygwin=delorie DOT com AT cygwin DOT com>

On Jul 24 09:28, Brian Inglis via Cygwin wrote:
> On 2025-07-24 04:30, Corinna Vinschen via Cygwin wrote:
> > Or shall we simply go along with CESU-8 when converting back to multibyte
> > to keep the string the same as with wcstombs?
> 
> There are 15 * SMP as BMP characters, so many non-Western and emoji
> characters will be expanded from 4 UTF-8 bytes to 6 CESU-8 bytes, and this
> is not supported anywhere as a string representation, designed for internal
> use only per the TR.

We're only talking about invalid sequences, not using CESU-8 throughout.


Corinna

-- 
Problem reports:      https://cygwin.com/problems.html
FAQ:                  https://cygwin.com/faq/
Documentation:        https://cygwin.com/docs.html
Unsubscribe info:     https://cygwin.com/ml/#unsubscribe-simple
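
For readers unfamiliar with the size difference mentioned in the quoted
text, the following is a minimal illustrative sketch, not code from Cygwin.
It assumes CESU-8 as commonly described: each UTF-16 surrogate of a
supplementary-plane character is encoded as its own 3-byte sequence. The
helper name cesu8_encode is hypothetical.

```python
def cesu8_encode(text: str) -> bytes:
    """Encode text as CESU-8 (hypothetical helper, for illustration only)."""
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if cp >= 0x10000:
            # Split the code point into a UTF-16 surrogate pair.
            cp -= 0x10000
            high = 0xD800 | (cp >> 10)
            low = 0xDC00 | (cp & 0x3FF)
            # Encode each surrogate as a separate 3-byte sequence.
            out += chr(high).encode("utf-8", "surrogatepass")
            out += chr(low).encode("utf-8", "surrogatepass")
        else:
            # BMP characters (including lone surrogates) encode the same
            # way as in UTF-8 with surrogates passed through.
            out += ch.encode("utf-8", "surrogatepass")
    return bytes(out)


emoji = "\U0001F600"              # a supplementary-plane character
utf8 = emoji.encode("utf-8")      # 4 bytes: f0 9f 98 80
cesu8 = cesu8_encode(emoji)       # 6 bytes: ed a0 bd ed b8 80
print(len(utf8), utf8.hex(" "))
print(len(cesu8), cesu8.hex(" "))
```

As the reply above notes, the proposal concerns only invalid sequences, so
valid supplementary-plane characters would keep their 4-byte UTF-8 form.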
