Mail Archives: djgpp/2003/08/29/08:45:04

Message-ID: <3F4F03C2.D9D1F88@acm.org>

From: Eric Sosman <esosman AT acm DOT org>

X-Mailer: Mozilla 4.72 [en] (Win95; U)

X-Accept-Language: en

MIME-Version: 1.0

Newsgroups: comp.os.msdos.djgpp

Subject: Re: wide character functions

References: <Kq53b.5954$L15 DOT 1502 AT newsfep4-winn DOT server DOT ntli DOT net> <2427-Thu28Aug2003000602+0300-eliz AT elta DOT co DOT il> <Dxj3b.94$c12 DOT 961 AT newsfep4-glfd DOT server DOT ntli DOT net> <8296-Thu28Aug2003162425+0300-eliz AT elta DOT co DOT il> <3F4E90EF DOT 33122DA4 AT phekda DOT freeserve DOT co DOT uk> <2110-Fri29Aug2003133636+0300-eliz AT elta DOT co DOT il>

Lines: 30

Date: Fri, 29 Aug 2003 12:39:48 GMT

NNTP-Posting-Host: 12.91.7.222

X-Complaints-To: abuse AT worldnet DOT att DOT net

X-Trace: bgtnsc04-news.ops.worldnet.att.net 1062160788 12.91.7.222 (Fri, 29 Aug 2003 12:39:48 GMT)

NNTP-Posting-Date: Fri, 29 Aug 2003 12:39:48 GMT

Organization: AT&T Worldnet

To: djgpp AT delorie DOT com

DJ-Gateway: from newsgroup comp.os.msdos.djgpp

Reply-To: djgpp AT delorie DOT com

Eli Zaretskii wrote:
> 
> Wide characters are one representation of non-ASCII characters.
> Another representation, which should also be supported by the library,
> is the multibyte representation, whereby every character is
> represented as a series of 8-bit bytes.  (Many libraries choose UTF-8
> as their multibyte representation.)  The is* macros should support the
> multibyte representation in a manner equivalent to what the isw*
> macros do with the wide characters.  That is, if you pass a wide
> representation of a character CH to iswprint and the multibyte
> representation of the same character to isprint, you should get the
> same result (I think).

    No; is*() and to*() work only with "plain" one-`char'-is-one-character
data.  (And with the special value EOF, of course.)  In fact, since the
argument to any of these is the value of a single `char', there's no
way they could see the second and subsequent bytes of a multibyte
encoding.
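
    To make that concrete, here is a minimal sketch, assuming a hosted
C99 library that provides <wchar.h> and <wctype.h> (the libc under
discussion may not).  The helper names count_printable_bytes() and
count_printable_chars() are hypothetical, made up for this illustration.
The first sees one byte per call, which is all isprint() can ever see;
the second decodes whole characters first and only then classifies them.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper: count the bytes of s that isprint() accepts.
   Each call receives exactly one byte value and cannot look at the
   bytes around it. */
size_t count_printable_bytes(const char *s)
{
    size_t n = 0;

    assert(s != NULL);
    for (; *s != '\0'; ++s) {
        /* The cast matters: a negative plain char is not a valid
           argument to the is*() functions. */
        if (isprint((unsigned char)*s))
            ++n;
    }
    return n;
}

/* Hypothetical helper: decode s with mbrtowc() in the current locale
   and count the decoded characters that iswprint() accepts.
   Returns (size_t)-1 if s is not valid in the locale's encoding. */
size_t count_printable_chars(const char *s)
{
    mbstate_t st;
    size_t n = 0;
    size_t left;

    assert(s != NULL);
    memset(&st, 0, sizeof st);          /* initial conversion state */
    left = strlen(s);
    while (left > 0) {
        wchar_t wc;
        size_t len = mbrtowc(&wc, s, left, &st);

        if (len == (size_t)-1 || len == (size_t)-2)
            return (size_t)-1;          /* invalid or truncated sequence */
        if (len == 0)
            break;                      /* decoded a null character */
        if (iswprint((wint_t)wc))
            ++n;
        s += len;                       /* step over the whole character */
        left -= len;
    }
    return n;
}
```

For pure ASCII text the two counts agree.  For text containing
multibyte characters they can differ, and what each one returns depends
on the locale selected with setlocale(), so no particular output is
claimed here.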

    The point remains, though, that introducing wide character and
multibyte support involves more than merely implementing the functions
with `w' in their names.  For example, the *printf() family must be
made aware of multibyte encodings (searching the format string for the
single character '%' does *not* suffice), and in C99 a FILE* stream
can have either wide- or narrow-character orientation.
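
    Both points can be sketched briefly, again assuming a hosted C99
library with <wchar.h>.  The names find_percent() and orientation_name()
are hypothetical and are not taken from any libc; a real *printf()
does considerably more than this.  The first helper steps through a
format string one whole character at a time, so a byte that merely
happens to equal '%' inside a longer multibyte character is not
mistaken for a conversion specification.  The second queries a stream's
orientation with fwide() without changing it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: return a pointer to the first '%' character in
   fmt, decoding fmt as a multibyte string in the current locale.
   Returns NULL if there is none or if fmt is not validly encoded. */
const char *find_percent(const char *fmt)
{
    mbstate_t st;
    size_t left;

    assert(fmt != NULL);
    memset(&st, 0, sizeof st);          /* start in the initial shift state */
    left = strlen(fmt);
    while (left > 0) {
        wchar_t wc;
        size_t len = mbrtowc(&wc, fmt, left, &st);

        if (len == (size_t)-1 || len == (size_t)-2)
            return NULL;                /* invalid or truncated sequence */
        if (len == 0)
            break;                      /* decoded a null character */
        if (wc == L'%')
            return fmt;                 /* a real '%', not a stray byte */
        fmt += len;                     /* skip the whole character */
        left -= len;
    }
    return NULL;
}

/* Hypothetical helper: describe a stream's orientation.  fwide(fp, 0)
   only queries; it does not set the orientation. */
const char *orientation_name(FILE *fp)
{
    int o;

    assert(fp != NULL);
    o = fwide(fp, 0);
    if (o > 0)
        return "wide";
    if (o < 0)
        return "narrow";
    return "unoriented";
}
```

A freshly opened stream has no orientation; the first byte-oriented
call (such as fputs) makes it narrow and the first wide-oriented call
(such as fwprintf) makes it wide, after which the library has to keep
the two families of functions from being mixed on that stream.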

-- 
Eric Sosman
esosman AT acm DOT org

