Mail Archives: djgpp-workers/2001/02/04/09:18:19

Message-ID: <001801c08eb5$4cf7b1e0$23d4883e@oemcomputer>
From: "Stephen Silver" <djgpp AT argentum DOT freeserve DOT co DOT uk>
To: <djgpp-workers AT delorie DOT com>
Subject: Re: stdint.h
Date: Sun, 4 Feb 2001 14:18:06 -0000
MIME-Version: 1.0
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 4.72.3110.1
X-MimeOLE: Produced By Microsoft MimeOLE V4.72.3110.3
Reply-To: djgpp-workers AT delorie DOT com

Eli Zaretskii wrote:

> On Sat, 3 Feb 2001, Stephen Silver wrote:
>
> > Also WINT_MIN should be -2147483648 rather than 0, since it's defined
> > as int.
>
> Thanks.
>
> However, I'm not sure we need to push it as far as -2147483648.  wint_t 
> should hold everything wchar_t does and WEOF.  C99 also seems to require 
> that WINT_MIN is at most -32767, which seems to be sufficient both for 
> wchar_t, which is unsigned short, and for WEOF, which is -1.
>
> So what are the reasons for pushing WINT_MIN all the way to INT_MIN?

I assumed that WINT_MIN was supposed to represent the minimum
possible value of a wint_t.  However, the C99 standard (or, at least,
the draft) does not seem to say this explicitly.  Nonetheless, I
think it would be strange if it were possible to assign a value less
than WINT_MIN to a wint_t.

From the C++ point of view it should also be noted that

   std::numeric_limits<wint_t>::min() == INT_MIN

if wint_t is typedef'ed as int.  I think users would expect that
WINT_MIN == std::numeric_limits<wint_t>::min().
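
The expectation above can be checked mechanically.  What follows is a
minimal sketch, assuming a C++11 (or later) compiler whose <cstdint>
exposes WINT_MIN and WINT_MAX.  The helper name wint_limits_agree is
hypothetical and not part of any library.

```cpp
#include <cstdint>   // WINT_MIN, WINT_MAX
#include <cwchar>    // std::wint_t
#include <limits>    // std::numeric_limits

// Hypothetical helper: true when the <stdint.h> macros describe the
// same range that numeric_limits reports for wint_t on this platform.
constexpr bool wint_limits_agree() {
    return WINT_MIN == std::numeric_limits<std::wint_t>::min()
        && WINT_MAX == std::numeric_limits<std::wint_t>::max();
}
```

On a platform where wint_t is typedef'ed as int but WINT_MIN is defined
as 0, this function would return false, which is the mismatch described
above.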

> > > > (and it will need to hold values higher than 32767 if it is ever to
> > > > be used for Unicode).
> > >
> > > 64K isn't enough for Unicode anyway, only for the BMP.
> > 
> > Section 5.2 of the Unicode Standard disagrees with you, as it
> > talks about using wchar_t for Unicode, and makes it clear that
> > a 16-bit wchar_t is quite sufficient.  Unicode is designed to
> > be 16-bit - that's why it has surrogate pairs.
>
> Well, I _was_ talking about surrogates, specifically.  I was also
> talking about planes beyond plane 0, the BMP.

OK, but I'm not sure what point you are trying to make here.
Unicode (unlike ISO/IEC 10646) recognises only 16 planes beyond
the BMP.  Characters in these supplementary planes are encoded as
pairs of surrogates, and the surrogate code units themselves
(0xD800-0xDFFF) lie in the BMP range.  So a 16-bit unsigned wchar_t
suffices to encode the whole of Unicode.
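
To make the surrogate mechanism concrete, here is a minimal sketch of
the standard UTF-16 arithmetic, assuming a C++11 compiler.  The helper
names (needs_surrogates, to_surrogates, from_surrogates) are
hypothetical, and std::uint16_t stands in for a 16-bit unsigned wchar_t.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Hypothetical helper: true for code points in the 16 supplementary planes.
constexpr bool needs_surrogates(char32_t cp) {
    return cp >= 0x10000 && cp <= 0x10FFFF;
}

// Split a supplementary-plane code point into a (high, low) surrogate pair.
std::pair<std::uint16_t, std::uint16_t> to_surrogates(char32_t cp) {
    assert(needs_surrogates(cp));
    // Subtracting 0x10000 leaves a 20-bit value: 10 bits per surrogate.
    const std::uint32_t v = static_cast<std::uint32_t>(cp) - 0x10000u;
    return { static_cast<std::uint16_t>(0xD800u + (v >> 10)),
             static_cast<std::uint16_t>(0xDC00u + (v & 0x3FFu)) };
}

// Recombine a (high, low) surrogate pair into the original code point.
char32_t from_surrogates(std::uint16_t high, std::uint16_t low) {
    assert(high >= 0xD800 && high <= 0xDBFF);
    assert(low >= 0xDC00 && low <= 0xDFFF);
    return static_cast<char32_t>(
        0x10000u
        + ((static_cast<std::uint32_t>(high) - 0xD800u) << 10)
        + (static_cast<std::uint32_t>(low) - 0xDC00u));
}
```

For example, to_surrogates(0x1F600) yields the pair 0xD83D, 0xDE00, and
both values fit in 16 bits.  The highest code point, 0x10FFFF, maps to
0xDBFF, 0xDFFF, so no pair ever leaves the surrogate range in the BMP.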

Stephen
