Mail Archives: djgpp-workers/1998/02/12/07:24:15

Sender: vheyndri AT rug DOT ac DOT be
Message-Id: <34E2E985.76E7@rug.ac.be>
Date: Thu, 12 Feb 1998 13:22:29 +0100
From: Vik Heyndrickx <Vik DOT Heyndrickx AT rug DOT ac DOT be>
Mime-Version: 1.0
To: Eli Zaretskii <eliz AT is DOT elta DOT co DOT il>
Cc: djgpp-workers AT delorie DOT com
Subject: Re: char != unsigned char... sometimes, sigh (long)
References: <Pine DOT SUN DOT 3 DOT 91 DOT 980212134847 DOT 19988C-100000 AT is>

Eli Zaretskii wrote:
> On Thu, 12 Feb 1998, Vik Heyndrickx wrote:
> > A char is supposed to be able to contain characters (among other
> > things), and when a char does contain a character it is supposed to be
> > positive. Don't say that is true. If you knew nothing about computers,
> > and someone did explain to you that characters are internally
> > represented by numbers, you would NOT expect these numbers to be
> > negative, unless you have a very twisted mind.
> 
> That's what I mean: those who think that way don't realize that C thinks
> otherwise.  Consider the following excerpt from the GCC manual:
> 
> `-Wchar-subscripts'
>      Warn if an array subscript has type `char'.  This is a common cause
>      of error, as programmers often forget that this type is signed on
>      some machines.
> 
> > but there is something very unnatural in negative character codes.

I know that warning option. It is just one more reason why we shouldn't
worry about changing the default to 'unsigned char'. The compiler still
issues the warning then. Only when 'char' is signed does a char subscript
actually produce buggy code. When 'char' is unsigned the program is not
buggy, only less portable.
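
To make the subscript problem concrete, here is a minimal sketch. It assumes 8-bit chars and two's complement; the table `counts` and the function names are made up for illustration and come from neither GCC nor djgpp.

```c
#include <limits.h>

/* Hypothetical table with one counter per possible byte value. */
static unsigned counts[UCHAR_MAX + 1];

/* The subscript that a signed 8-bit char would yield for byte b
   (two's complement assumed): bytes above 127 come out negative,
   so counts[c] would read in front of the array. */
int subscript_if_char_signed(unsigned char b)
{
    return b > SCHAR_MAX ? (int)b - (UCHAR_MAX + 1) : (int)b;
}

/* Works whether plain char is signed or unsigned:
   convert to unsigned char before subscripting. */
void count_char(char c)
{
    counts[(unsigned char)c]++;
}

/* Read back a counter; the parameter type keeps the index in range. */
unsigned count_of(unsigned char b)
{
    return counts[b];
}
```

Calling count_char('\x84') bumps counts[0x84] on either kind of compiler, whereas a bare counts[c] would use subscript -124 wherever char is signed.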

> The type char is not limited to character codes, it's just a small
> integer.

I wrote:
> > A char is supposed to be able to contain characters (among other
> > things).

When K&R wrote their first implementation, I think they would have found
the statement "The type char IS limited to character codes" acceptable at
that time.

If I were to write a tutorial for the modified djgpp compiler, I would
write this about it:
- the 'signed char' type can contain integers in the range -128 .. 127
- the 'unsigned char' type can contain integers in the range 0 .. 255
- 'char' is a type of the same size as 'signed char' and 'unsigned char',
but distinct from both. Variables of this type should be avoided because
on some systems this type is signed and on others unsigned. Hence these
variables cannot portably be passed to the is* macros or compared with
values returned from the fgetc function without explicit casting.
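
The explicit casting meant in the last point could look like the sketch below. The helper names are hypothetical; only CHAR_MIN, isalpha and the fgetc return convention (an unsigned char value converted to int, or EOF) are standard C.

```c
#include <ctype.h>
#include <limits.h>

/* 1 if plain char is signed on this compiler, 0 if it is unsigned. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* The is* macros accept only EOF or a value representable as
   unsigned char, so a plain char is cast before the call. */
int is_alpha_char(char c)
{
    return isalpha((unsigned char)c) != 0;
}

/* fgetc returns the byte as an unsigned char converted to int.
   Comparing it with a plain char needs the same cast, otherwise
   a byte such as 0x84 never matches where char is signed. */
int matches_fgetc_value(int from_fgetc, char c)
{
    return from_fgetc == (unsigned char)c;
}
```

With these casts the result no longer depends on which signedness the compiler picked for plain char.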

Since '\x84' gets its value from a char (which is then converted to an
int), its sign depends on whether char is signed. Therefore an expression
like toupper('\x84') CAN NEVER work as long as '\x84' is negative.
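
A small sketch of that last point, assuming 8-bit chars and the default "C" locale; the helper names are invented for illustration.

```c
#include <ctype.h>
#include <limits.h>

/* Value of the constant '\x84' as this compiler sees it: negative
   where plain char is signed, 132 where it is unsigned. */
int x84_value(void)
{
    return '\x84';
}

/* toupper is only defined for EOF and unsigned char values, so the
   argument is forced into that range before the call. */
int to_upper_char(char c)
{
    return toupper((unsigned char)c);
}
```

to_upper_char('\x84') always hands toupper the value 132, while a bare toupper('\x84') hands it x84_value(), which is out of range whenever char is signed.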

-- 
 \ Vik /-_-_-_-_-_-_/   
  \___/ Heyndrickx /          
   \ /-_-_-_-_-_-_/

  Copyright © 2019   by DJ Delorie     Updated Jul 2019