Sender: vheyndri AT rug DOT ac DOT be
Message-Id: <34E2E985.76E7@rug.ac.be>
Date: Thu, 12 Feb 1998 13:22:29 +0100
From: Vik Heyndrickx
Mime-Version: 1.0
To: Eli Zaretskii
Cc: djgpp-workers AT delorie DOT com
Subject: Re: char != unsigned char... sometimes, sigh (long)
References:
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Precedence: bulk

Eli Zaretskii wrote:
> On Thu, 12 Feb 1998, Vik Heyndrickx wrote:
> > A char is supposed to be able to contain characters (among other
> > things), and when a char does contain a character it is supposed to be
> > positive. Don't say that is true. If you knew nothing about computers,
> > and someone did explain to you that characters are internally
> > represented by numbers, you would NOT expect these numbers to be
> > negative, unless you have a very twisted mind.
>
> That's what I mean: those who think that way don't realize that C thinks
> otherwise. Consider the following excerpt from the GCC manual:
>
> `-Wchar-subscripts'
>      Warn if an array subscript has type `char'. This is a common cause
>      of error, as programmers often forget that this type is signed on
>      some machines.
>
> > but there is something very unnatural in negative character codes.

I know that warning option. It is just one more reason why we shouldn't
worry about changing the default to 'unsigned char'. The compiler still
issues the warning then. Only when 'char' is 'signed char' is the
program actually buggy. When 'char' is 'unsigned char' the program is
not buggy, only less portable.

> The type char is not limited to character codes, it's just a small
> integer.

I wrote:
> > A char is supposed to be able to contain characters (among other
> > things).

When K&R wrote their first implementation, I think they would have found
the statement "The type char IS limited to character codes" acceptable
at that time.
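To make the -Wchar-subscripts case quoted above concrete, here is a minimal sketch, assuming 8-bit chars. The table and the function names are hypothetical and only serve as illustration; they come from neither the GCC manual nor the djgpp sources.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical table with one slot per unsigned char value. */
static unsigned char seen[UCHAR_MAX + 1];

/* Writing seen[c] with a plain char c is what -Wchar-subscripts warns
   about: where char is signed, a code above 127 becomes a negative
   subscript. Converting to unsigned char first keeps it in range. */
static void mark_seen(char c)
{
    seen[(unsigned char)c] = 1;
}

static int was_seen(char c)
{
    return seen[(unsigned char)c];
}

/* Small self-check; call it from main() to try the sketch. */
static void check_seen_table(void)
{
    char c = '\x84';

    mark_seen(c);
    assert(was_seen(c));
    assert(seen[0x84] == 1);   /* same slot whether char is signed or not */
    assert(!was_seen('a'));
}
```

With the conversion in place the code behaves the same whether plain char is signed or unsigned, so it does not depend on the compiler default.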
If I were to write a tutorial for the modified djgpp compiler, this is
what I would write about it:

- The signed char type can contain integers in the range -128 .. 127.
- The unsigned char type can contain non-negative integers in the range
  0 .. 255.
- char is a type of the same size as 'signed char' and 'unsigned char',
  but distinct from both. Variables of this type should be avoided
  because on some systems this type is signed and on others unsigned.
  Hence these variables cannot portably be passed to the is* macros or
  used with values returned from the fgetc function without explicit
  casting.

Since the value of '\x84' is that of a char (converted to an int), its
sign depends on the signedness of char. Therefore an expression like
toupper('\x84') CAN NEVER work as long as '\x84' is negative.

-- 
 \ Vik /-_-_-_-_-_-_/
  \___/ Heyndrickx /
   \ /-_-_-_-_-_-_/
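The explicit casting mentioned above can be sketched as follows, assuming 8-bit chars and the default "C" locale. The helper names are hypothetical and not part of djgpp or the C library; only toupper() from <ctype.h> is a real library function.

```c
#include <assert.h>
#include <ctype.h>

/* Hypothetical helper: upper-case a plain char by routing it through
   unsigned char, so the argument of toupper() is never negative,
   whatever the signedness of char. */
static char upcase_char(char c)
{
    return (char)toupper((unsigned char)c);
}

/* Hypothetical helper: the code of a plain char viewed as an
   unsigned char, i.e. always in the range 0 .. UCHAR_MAX. */
static int char_code(char c)
{
    return (unsigned char)c;
}

/* Small self-check; call it from main() to try the sketch. */
static void check_char_helpers(void)
{
    assert(upcase_char('a') == 'A');
    assert(char_code('\x84') == 0x84);  /* never negative after the cast */
    assert(char_code('A') == 'A');
}
```

The cast costs nothing at run time on typical compilers; it only makes the conversion explicit, so the result does not depend on whether plain char is signed.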