Mail Archives: cygwin/2015/04/01/12:16:48

Date: Wed, 1 Apr 2015 18:16:27 +0200
From: Corinna Vinschen <corinna-cygwin AT cygwin DOT com>
To: cygwin AT cygwin DOT com
Subject: Re: With bad UTF-8, cygwin can create files it can't read


On Apr  1 10:01, Warren Young wrote:
> On Apr 1, 2015, at 7:34 AM, Corinna Vinschen <corinna-cygwin AT cygwin DOT com> wrote:
> >
> > As you probably know, Unicode values beyond the base plane (that is,
> > everything > 0xffff in UTF-32 and > ef bf bf in UTF-8 notation)
> > are represented as so-called surrogate pairs in UTF-16, two UTF-16
> > values in the 0xd800 - 0xdfff range.
>
> I happened to have run across a similar strangeness in Unicode earlier
> today.  Does Cygwin cope with/care about Unicode normalization forms?

Not at all.  UTF-8 string in, equivalent UTF-16 string out and vice versa,
on the bit level, with no normalization applied.  Additionally, there's a
replacement for UTF-16 values which can't be handled by the current
(non-UTF-8) codeset, e.g. ISO8859-1: ASCII CAN (0x18) followed by the UTF-8
representation of the UTF-16 character.
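
The behaviour described in this thread can be illustrated with a short Python
sketch. This is not Cygwin's code. The helper names `to_utf16_units` and
`encode_with_can_fallback` are hypothetical and chosen only for this example,
and the fallback mirrors only the description above (CAN followed by the
character's UTF-8 bytes). It shows three things: a character beyond the base
plane becomes a surrogate pair in UTF-16, composed and decomposed forms of the
same text stay distinct when nothing normalizes them, and an unrepresentable
character is written with the CAN prefix.

```python
import unicodedata

CAN = b"\x18"  # ASCII CAN (Ctrl-X)


def to_utf16_units(s: str) -> list:
    """Return the UTF-16 code units of s as integers (no BOM)."""
    raw = s.encode("utf-16-be")
    return [int.from_bytes(raw[i:i + 2], "big") for i in range(0, len(raw), 2)]


def encode_with_can_fallback(s: str, codeset: str) -> bytes:
    """Hypothetical helper: encode s in codeset; any character the codeset
    cannot represent is emitted as CAN followed by its UTF-8 bytes."""
    out = bytearray()
    for ch in s:
        try:
            out += ch.encode(codeset)
        except UnicodeEncodeError:
            out += CAN + ch.encode("utf-8")
    return bytes(out)


# A code point above 0xffff needs two UTF-16 units from the 0xd800-0xdfff range.
print([hex(u) for u in to_utf16_units("\U0001F600")])

# Same visible text, two normalization forms: a plain transcoding keeps them apart.
nfc = unicodedata.normalize("NFC", "e\u0301")  # single code point U+00E9
nfd = unicodedata.normalize("NFD", nfc)        # 'e' followed by U+0301
print(to_utf16_units(nfc), to_utf16_units(nfd))

# U+20AC (euro sign) is not in ISO8859-1, so it gets the CAN prefix.
print(encode_with_can_fallback("a\u20acb", "iso8859-1"))
```

Because the conversion does no normalization, two file names that look the
same but differ in normalization form remain two different names after
conversion in either direction.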


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat

