Mail Archives: cygwin/2011/01/29/12:22:01
On 01/29/2011 09:01 AM, Corinna Vinschen wrote:
>> So, using UTF-16 surrogate encodings for characters outside the basic
>> plane violates POSIX, but it's the best we can do for those characters.
>
> Right, and we discussed this already on this list. Or the developer
> list, I don't remember. Maybe we should have stuck to the base plane
> and only use UCS-2 to be more POSIX compatible.
The burden is on the application, not on cygwin. If the application
wants POSIX behavior, then it obeys __STDC_ISO_10646__ and uses ONLY
characters from the basic plane (no surrogates), at which point its
use of wchar_t fits the POSIX definition (one wchar_t per character).
The moment it passes a surrogate, it is no longer honoring the
restriction documented by __STDC_ISO_10646__, so it is no longer under
the rules of POSIX, and cygwin can do whatever it wants. In this case,
QoI (quality of implementation) demands that we honor surrogates to the
best of our ability for full UTF-16 support, so you can have
multi-wchar_t characters just as you already have multi-char UTF-8
characters.
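
To illustrate the multi-wchar_t case, here is a minimal sketch of how a
UTF-16 surrogate pair maps to a single character. It uses uint16_t
rather than wchar_t so that it behaves the same on platforms with a
32-bit wchar_t; the function names are hypothetical and are not part of
cygwin or newlib.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* True if u is a high (leading) surrogate, U+D800..U+DBFF. */
int is_high_surrogate(uint16_t u) { return u >= 0xD800 && u <= 0xDBFF; }

/* True if u is a low (trailing) surrogate, U+DC00..U+DFFF. */
int is_low_surrogate(uint16_t u) { return u >= 0xDC00 && u <= 0xDFFF; }

/* Combine a well-formed surrogate pair into one code point. */
uint32_t decode_surrogate_pair(uint16_t hi, uint16_t lo)
{
    assert(is_high_surrogate(hi) && is_low_surrogate(lo));
    return 0x10000u + (((uint32_t)hi - 0xD800u) << 10)
                    + ((uint32_t)lo - 0xDC00u);
}

/* Count characters in n UTF-16 code units. A well-formed pair counts
 * as one character; an unpaired surrogate is counted as one unit. */
size_t utf16_count_chars(const uint16_t *s, size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (is_high_surrogate(s[i]) && i + 1 < n
            && is_low_surrogate(s[i + 1]))
            i++;            /* skip the trailing half of the pair */
        count++;
    }
    return count;
}
```

For example, the pair 0xD83D 0xDE00 decodes to U+1F600, and the four
units { 'A', 0xD83D, 0xDE00, 'B' } count as three characters, which is
exactly the "more than one wchar_t per character" situation described
above.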
In other words, cygwin IS being POSIX-compliant by advertising only the
Unicode 4.0 character set in __STDC_ISO_10646__, while still
supporting Unicode 5.2 (should we upgrade to Unicode 6.0?) as an
extension when you no longer care about POSIX.
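
As a sketch of what "obeying __STDC_ISO_10646__" could look like from
the application side: the macro, when defined, expands to a yyyymmL
value naming the ISO/IEC 10646 revision the implementation advertises,
and an application that wants the one-wchar_t-per-character guarantee
can verify that a string stays within the basic plane and contains no
surrogates. The helper names below are hypothetical, and the exact
value of the macro on any given cygwin release is not assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Return the advertised ISO/IEC 10646 revision (yyyymm), or 0 if the
 * implementation does not define __STDC_ISO_10646__. */
long iso10646_version(void)
{
#ifdef __STDC_ISO_10646__
    return (long)__STDC_ISO_10646__;
#else
    return 0L;
#endif
}

/* Return 1 if every wchar_t in s is a basic-plane character that is
 * not a surrogate, so that one wchar_t is one character; else 0. */
int is_bmp_only(const wchar_t *s)
{
    assert(s != NULL);
    for (; *s != L'\0'; s++) {
        unsigned long c = (unsigned long)*s;
        if (c > 0xFFFFUL)                    /* outside the basic plane */
            return 0;
        if (c >= 0xD800UL && c <= 0xDFFFUL)  /* surrogate code unit */
            return 0;
    }
    return 1;
}
```

An application could call is_bmp_only() on its input before relying on
wchar_t-per-character semantics, and fall back to surrogate-aware
handling otherwise.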
> However, the POSIX definition doesn't contradict what I said about the
> definition of __STDC_ISO_10646__ as far as I'm concerned.
Yep - I think we're in violent agreement :)
-- 
Eric Blake eblake AT redhat DOT com +1-801-349-2682
Libvirt virtualization library http://libvirt.org