To: cygwin@cygwin.com
Subject: Re: readdir() returns inaccessible name if file was created with
 invalid UTF-8
Date: Sat, 28 Jun 2025 12:13:13 -0400
From: Andrew Schulman via Cygwin <cygwin@cygwin.com>
Reply-To: Andrew Schulman <andrex.e.schulman@gmail.com>

>> Testcase (attached):
>
>Thanks for the testcase!
>
>I found the problem in the newlib core function creating wchar_t from
>UTF-8 input.  In case of 4 byte UTF-8 sequences, the code created the
>low surrogate already after reading byte 3, without checking if byte 4
>of the UTF-8 sequence is a valid byte. Hilarity ensues.
>
>Fortunately this bug has only been introduced very recently, to wit, on
>2009-03-24, a mere 16 years ago.  And it is my bug and mine alone :}
>
>I'm just prep'ing a fix which I'll push in a minute or two.
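
For anyone following along, here is a rough sketch in C of the check
described above. It is not newlib's actual code, and the function name
utf8_4byte_to_utf16 is made up for illustration. The point is only that
every continuation byte, including byte 4, gets validated before either
surrogate is written.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, NOT newlib's implementation: decode one 4-byte
   UTF-8 sequence into a UTF-16 surrogate pair.  Returns 4 on success,
   -1 if the sequence is truncated or invalid.  out[] is left untouched
   on failure. */
static int utf8_4byte_to_utf16(const unsigned char *s, size_t n,
                               uint16_t out[2])
{
    if (n < 4)
        return -1;

    /* The lead byte of a 4-byte sequence is in the range 0xF0..0xF4. */
    if (s[0] < 0xF0 || s[0] > 0xF4)
        return -1;

    /* Validate ALL continuation bytes, byte 4 included, before
       emitting anything. */
    for (int i = 1; i < 4; i++)
        if ((s[i] & 0xC0) != 0x80)
            return -1;

    uint32_t cp = ((uint32_t)(s[0] & 0x07) << 18)
                | ((uint32_t)(s[1] & 0x3F) << 12)
                | ((uint32_t)(s[2] & 0x3F) << 6)
                |  (uint32_t)(s[3] & 0x3F);

    /* Reject overlong encodings and values beyond U+10FFFF. */
    if (cp < 0x10000 || cp > 0x10FFFF)
        return -1;

    cp -= 0x10000;
    out[0] = (uint16_t)(0xD800 | (cp >> 10));    /* high surrogate */
    out[1] = (uint16_t)(0xDC00 | (cp & 0x3FF));  /* low surrogate  */
    return 4;
}
```

With this shape, a sequence such as F0 9F 98 41 (a plain 'A' where byte 4
should be) is rejected as a whole, so no stray surrogate ends up in the
converted file name.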

Gold star awarded! https://cygwin.com/goldstars/#CV

-- 
Problem reports:      https://cygwin.com/problems.html
FAQ:                  https://cygwin.com/faq/
Documentation:        https://cygwin.com/docs.html
Unsubscribe info:     https://cygwin.com/ml/#unsubscribe-simple
