Mailing-List: | contact cygwin-help AT cygwin DOT com; run by ezmlm |
List-Id: | <cygwin.cygwin.com> |
List-Subscribe: | <mailto:cygwin-subscribe AT cygwin DOT com> |
List-Archive: | <http://sourceware.org/ml/cygwin/> |
List-Post: | <mailto:cygwin AT cygwin DOT com> |
List-Help: | <mailto:cygwin-help AT cygwin DOT com>, <http://sourceware.org/ml/#faqs> |
Sender: | cygwin-owner AT cygwin DOT com |
Mail-Followup-To: | cygwin AT cygwin DOT com |
Date: | Wed, 1 Apr 2015 18:10:29 +0200 |
From: | Corinna Vinschen <corinna-cygwin AT cygwin DOT com> |
To: | cygwin AT cygwin DOT com |
Subject: | Re: With bad UTF-8, cygwin can create files it can't read |
Message-ID: | <20150401161029.GB13285@calimero.vinschen.de> |
Reply-To: | cygwin AT cygwin DOT com |
References: | <CAOCY71AaRWGEFVcPqLKNEjqWEkELdfLD-KBvxMAQCi0wt2A5ZA AT mail DOT gmail DOT com> <20150330110446 DOT GK29875 AT calimero DOT vinschen DOT de> <20150401133401 DOT GV13285 AT calimero DOT vinschen DOT de> |
MIME-Version: | 1.0 |
In-Reply-To: | <20150401133401.GV13285@calimero.vinschen.de> |
User-Agent: | Mutt/1.5.23 (2014-03-12) |
On Apr  1 15:34, Corinna Vinschen wrote:
> Hi Stuart,
>
> On Mar 30 13:04, Corinna Vinschen wrote:
> > On Mar 25 14:34, Kyzer wrote:
> > > Hello,
> > >
> > > I've found that if you use cygwin to create a file with badly-encoded
> > > UTF-8, readdir() gives out an entry with a name that cygwin won't
> > > subsequently accept.
> > >
> > > * create a file using filename with hex bytes F4 8F BF BF
> > > * readdir() reports the filename as hex bytes E2 8E B3 ED BF BF
> > > * attempting to open or unlink the filename E2 8E B3 ED BF BF fails
> > > * attempting to open or unlink the filename F4 8F BF BF succeeds
> >
> > Thanks for the testcase.  I'll have a look later this week (I hope).
>
> Wow.  Just wow.  You found a long-standing bug in the wctomb conversion
> from UTF-16 to UTF-8.
>
> As you probably know, Unicode values beyond the base plane (that is,
> everything > 0xffff in UTF-32 and > ef bf bf in UTF-8 notation)
> are represented as so-called surrogate pairs in UTF-16, two UTF-16
> values in the 0xd800 - 0xdfff range.
>
> While the conversion from UTF-8 f4 8f bf bf to UTF-16 dbff dfff
> worked fine, the conversion back to UTF-8 has a subtle bug.  There's
> a test for a lone high surrogate in the underlying conversion
> function.  This tests the next UTF-16 value like this:
>
>   if (wchar < 0xdc00 || wchar >= 0xdfff)
>     /* Handle lone high surrogate */
>
> Notice the >= 0xdfff?  That should have been > 0xdfff.  Duh.  This
> bug is only a bit over 5 years old...
>
> Fixed in the git repo.  I'll regenerate today's fool..., erm, today's
> developer snapshot on https://cygwin.com/snapshots/ later today.

Snapshot is up.  Please give it a try.


Thanks,
Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat
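
The off-by-one described in the message can be illustrated with a small
C sketch.  This is not the actual Cygwin source; the function names
(lone_high_buggy, lone_high_fixed, pair_to_utf8, self_check) are made up
for illustration, and what the real code does once it decides a high
surrogate is "lone" is not shown in the mail, so the sketch simply
returns 0 in that case.  It only shows that the quoted test rejects
0xdfff as a low surrogate, while the corrected test lets dbff dfff
combine into U+10FFFF and encode as f4 8f bf bf.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The test as quoted in the mail: 0xdfff is wrongly treated as
   "not a low surrogate", so the preceding high surrogate looks lone. */
static int lone_high_buggy(uint16_t next)
{
    return next < 0xdc00 || next >= 0xdfff;
}

/* The corrected test: every value in 0xdc00..0xdfff is a low surrogate. */
static int lone_high_fixed(uint16_t next)
{
    return next < 0xdc00 || next > 0xdfff;
}

/* Hypothetical helper: combine a surrogate pair and write the 4-byte
   UTF-8 form into out.  Returns 4 on success, or 0 if hi is not a high
   surrogate or the supplied predicate considers it lone (the real
   handling of that case is not covered here). */
static size_t pair_to_utf8(uint16_t hi, uint16_t lo,
                           int (*is_lone)(uint16_t), unsigned char out[4])
{
    uint32_t cp;

    if (hi < 0xd800 || hi > 0xdbff || is_lone(lo))
        return 0;

    /* Standard UTF-16 decoding: 10 bits from each half, plus 0x10000. */
    cp = 0x10000u + (((uint32_t)hi - 0xd800u) << 10)
                  + ((uint32_t)lo - 0xdc00u);

    /* Code points above 0xffff always take four bytes in UTF-8. */
    out[0] = (unsigned char)(0xf0 | (cp >> 18));
    out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3f));
    out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3f));
    out[3] = (unsigned char)(0x80 | (cp & 0x3f));
    return 4;
}

/* Check the pair from the bug report against both predicates. */
static void self_check(void)
{
    unsigned char buf[4] = { 0, 0, 0, 0 };

    /* Buggy test: dbff dfff is not recognised as a pair. */
    assert(pair_to_utf8(0xdbff, 0xdfff, lone_high_buggy, buf) == 0);

    /* Fixed test: the pair round-trips to f4 8f bf bf. */
    assert(pair_to_utf8(0xdbff, 0xdfff, lone_high_fixed, buf) == 4);
    assert(buf[0] == 0xf4 && buf[1] == 0x8f);
    assert(buf[2] == 0xbf && buf[3] == 0xbf);
}
```

Calling self_check() from a main() exercises both predicates.  Only a
low surrogate of exactly 0xdfff trips the buggy test, which fits a bug
that went unnoticed for years: a pair such as dbff dffe passes either
way.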