Message-ID: <3F4F03C2.D9D1F88@acm.org>
From: Eric Sosman
X-Mailer: Mozilla 4.72 [en] (Win95; U)
X-Accept-Language: en
MIME-Version: 1.0
Newsgroups: comp.os.msdos.djgpp
Subject: Re: wide character functions
References: <2427-Thu28Aug2003000602+0300-eliz AT elta DOT co DOT il>
 <8296-Thu28Aug2003162425+0300-eliz AT elta DOT co DOT il>
 <3F4E90EF DOT 33122DA4 AT phekda DOT freeserve DOT co DOT uk>
 <2110-Fri29Aug2003133636+0300-eliz AT elta DOT co DOT il>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Lines: 30
Date: Fri, 29 Aug 2003 12:39:48 GMT
NNTP-Posting-Host: 12.91.7.222
X-Complaints-To: abuse AT worldnet DOT att DOT net
X-Trace: bgtnsc04-news.ops.worldnet.att.net 1062160788 12.91.7.222 (Fri, 29 Aug 2003 12:39:48 GMT)
NNTP-Posting-Date: Fri, 29 Aug 2003 12:39:48 GMT
Organization: AT&T Worldnet
To: djgpp AT delorie DOT com
DJ-Gateway: from newsgroup comp.os.msdos.djgpp
Reply-To: djgpp AT delorie DOT com

Eli Zaretskii wrote:
>
> Wide characters are one representation of non-ASCII characters.
> Another representation, which should also be supported by the library,
> is the multibyte representation, whereby every character is
> represented as a series of 8-bit bytes.  (Many libraries choose UTF-8
> as their multibyte representation.)  The is* macros should support the
> multibyte representation in a manner equivalent to what the isw*
> macros do with the wide characters.  That is, if you pass a wide
> representation of a character CH to iswprint and the multibyte
> representation of the same character to isprint, you should get the
> same result (I think).

    No; is*() and to*() work only with "plain"
one-`char'-is-one-character data.  (And with the special value
EOF, of course.)  In fact, since the argument to any of these
is the value of a single character, there's no way they could
see the second and subsequent bytes of a multibyte encoding.
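
    To make the distinction concrete, here is a rough sketch (not
anything taken from the DJGPP library) of the two ways one might count
printable characters in a string.  The function names
count_printable_bytes() and count_printable_chars() are made up for
this illustration; the library calls themselves (isprint, mbrtowc,
iswprint) are the standard ones.  The first function hands isprint()
one `char' at a time, which is all it can ever see; the second decodes
each multibyte character to a wchar_t first and only then classifies
it.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper: classify byte by byte.  isprint() receives the
 * value of a single unsigned char and knows nothing about neighbours. */
size_t count_printable_bytes(const char *s)
{
    size_t n = 0;

    assert(s != NULL);
    for (; *s != '\0'; s++) {
        /* The cast keeps negative char values out of isprint(). */
        if (isprint((unsigned char)*s))
            n++;
    }
    return n;
}

/* Hypothetical helper: decode each multibyte character in the current
 * locale, then classify the resulting wide character.
 * Returns (size_t)-1 if the string holds an invalid or truncated
 * multibyte sequence. */
size_t count_printable_chars(const char *s)
{
    mbstate_t state;
    size_t n = 0;
    size_t left;

    assert(s != NULL);
    memset(&state, 0, sizeof state);   /* initial conversion state */
    left = strlen(s);

    while (left > 0) {
        wchar_t wc;
        size_t len = mbrtowc(&wc, s, left, &state);

        if (len == (size_t)-1 || len == (size_t)-2)
            return (size_t)-1;         /* invalid or incomplete sequence */
        if (len == 0)
            break;                     /* decoded a null character */
        if (iswprint((wint_t)wc))
            n++;
        s += len;
        left -= len;
    }
    return n;
}
```

    In the default "C" locale with plain ASCII text the two counts
agree.  After something like setlocale(LC_CTYPE, "") selects a
multibyte locale, they can differ for non-ASCII text, because the
first function is still looking at individual bytes.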

    The point remains, though, that introducing wide character
and multibyte support involves more than merely implementing the
functions with `w' in their names.  For example, the *printf()
family must be made aware of multibyte encodings (searching the
format string for the single character '%' does *not* suffice),
and in C99 a FILE* stream can have either wide- or narrow-character
orientation.

-- 
Eric Sosman
esosman AT acm DOT org