Mail Archives: djgpp/2003/08/29/08:45:04

Message-ID: <3F4F03C2.D9D1F88@acm.org>
From: Eric Sosman <esosman AT acm DOT org>
X-Mailer: Mozilla 4.72 [en] (Win95; U)
X-Accept-Language: en
MIME-Version: 1.0
Newsgroups: comp.os.msdos.djgpp
Subject: Re: wide character functions
References: <Kq53b.5954$L15 DOT 1502 AT newsfep4-winn DOT server DOT ntli DOT net> <2427-Thu28Aug2003000602+0300-eliz AT elta DOT co DOT il> <Dxj3b.94$c12 DOT 961 AT newsfep4-glfd DOT server DOT ntli DOT net> <8296-Thu28Aug2003162425+0300-eliz AT elta DOT co DOT il> <3F4E90EF DOT 33122DA4 AT phekda DOT freeserve DOT co DOT uk> <2110-Fri29Aug2003133636+0300-eliz AT elta DOT co DOT il>
Lines: 30
Date: Fri, 29 Aug 2003 12:39:48 GMT
NNTP-Posting-Host: 12.91.7.222
X-Complaints-To: abuse AT worldnet DOT att DOT net
X-Trace: bgtnsc04-news.ops.worldnet.att.net 1062160788 12.91.7.222 (Fri, 29 Aug 2003 12:39:48 GMT)
NNTP-Posting-Date: Fri, 29 Aug 2003 12:39:48 GMT
Organization: AT&T Worldnet
To: djgpp AT delorie DOT com
DJ-Gateway: from newsgroup comp.os.msdos.djgpp
Reply-To: djgpp AT delorie DOT com

Eli Zaretskii wrote:
> 
> Wide characters are one representation of non-ASCII characters.
> Another representation, which should also be supported by the library,
> is the multibyte representation, whereby every character is
> represented as a series of 8-bit bytes.  (Many libraries choose UTF-8
> as their multibyte representation.)  The is* macros should support the
> multibyte representation in a manner equivalent to what the isw*
> macros do with the wide characters.  That is, if you pass a wide
> representation of a character CH to iswprint and the multibyte
> representation of the same character to isprint, you should get the
> same result (I think).

    No; is*() and to*() work only with "plain" one-`char'-is-one-character
data.  (And with the special value EOF, of course.)  In fact, since the
argument to any of these is the value of a single character, there's no
way they could see the second and subsequent characters of a multibyte
encoding.
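
    To make the distinction concrete, here is a minimal sketch (the
function names are mine, purely for illustration).  The first function
hands isprint() one unsigned char value per call, which is all it can
ever be given.  The second decodes whole multibyte characters with
mbrtowc() in the current locale and only then classifies them with
iswprint().  What non-ASCII input produces depends on the locale and
the library, so treat this as a sketch rather than a reference.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper: classify byte by byte.  Each isprint() call
 * receives exactly one unsigned char value, so it can never look at
 * the following bytes of a multibyte character. */
size_t count_printable_bytes(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; s++) {
        if (isprint((unsigned char)*s))
            n++;
    }
    return n;
}

/* Hypothetical helper: decode each multibyte character in the current
 * locale, then classify the resulting wide character.
 * Returns (size_t)-1 if the string holds an invalid or incomplete
 * multibyte sequence. */
size_t count_printable_chars(const char *s)
{
    mbstate_t st;
    size_t n = 0;
    size_t len = strlen(s);

    memset(&st, 0, sizeof st);      /* start in the initial shift state */
    while (len > 0) {
        wchar_t wc;
        size_t k = mbrtowc(&wc, s, len, &st);

        if (k == (size_t)-1 || k == (size_t)-2)
            return (size_t)-1;      /* invalid or truncated sequence */
        if (k == 0)
            break;                  /* decoded the terminating null */
        if (iswprint((wint_t)wc))
            n++;
        s += k;
        len -= k;
    }
    return n;
}
```

For pure ASCII text the two counts agree; they can differ as soon as a
character occupies more than one byte in the locale's encoding.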

    The point remains, though, that introducing wide character and
multibyte support involves more than merely implementing the functions
with `w' in their names.  For example, the *printf() family must be
made aware of multibyte encodings (searching the format string for the
single character '%' does *not* suffice), and in C99 a FILE* stream
can have either wide- or narrow-character orientation.
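
    As an illustration of the '%' problem, here is a small sketch.
The choice of ISO-2022-JP is my assumption (no particular encoding was
named in this thread); it is a state-dependent encoding in which the
byte 0x25, the ASCII code for '%', also serves as the first byte of
the katakana characters.  The sample string and both function names
are hypothetical, and the scanner handles only the few escape
sequences shown.

```c
#include <stddef.h>
#include <string.h>

/* Katakana letter A followed by "%d", encoded in ISO-2022-JP:
 * ESC $ B selects JIS X 0208, where the letter is the byte pair
 * 0x25 0x22; ESC ( B switches back to ASCII. */
static const char sample_format[] = {
    0x1b, '$', 'B', 0x25, 0x22, 0x1b, '(', 'B', '%', 'd', '\0'
};

/* Naive search: treats every byte equal to '%' as a conversion
 * introducer, regardless of shift state.
 * Returns the byte offset, or -1 if none is found. */
long find_percent_naive(const char *s)
{
    const char *p = strchr(s, '%');
    return p != NULL ? (long)(p - s) : -1L;
}

/* State-aware search: tracks the shift state and skips two-byte
 * characters whole.  Returns the byte offset of the first real '%',
 * or -1 if there is none or the input uses an escape sequence this
 * sketch does not know. */
long find_percent_iso2022jp(const char *s)
{
    int two_byte = 0;   /* 0: single-byte mode, 1: JIS X 0208 mode */
    size_t i = 0;

    while (s[i] != '\0') {
        if (s[i] == 0x1b) {
            if (s[i + 1] == '$' && (s[i + 2] == '@' || s[i + 2] == 'B')) {
                two_byte = 1;           /* enter two-byte mode */
                i += 3;
                continue;
            }
            if (s[i + 1] == '(' && (s[i + 2] == 'B' || s[i + 2] == 'J')) {
                two_byte = 0;           /* back to ASCII / JIS Roman */
                i += 3;
                continue;
            }
            return -1L;                 /* unsupported escape sequence */
        }
        if (two_byte) {
            if (s[i + 1] == '\0')
                return -1L;             /* truncated two-byte character */
            i += 2;                     /* skip the whole character */
            continue;
        }
        if (s[i] == '%')
            return (long)i;
        i++;
    }
    return -1L;
}
```

On sample_format the naive search stops at offset 3, in the middle of
the katakana character, while the state-aware search reports offset 8,
the real conversion specification.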

-- 
Eric Sosman
esosman AT acm DOT org
