Mail Archives: djgpp-workers/1998/02/11/06:20:08

Sender: vheyndri AT rug DOT ac DOT be
Message-Id: <34E18927.4CCE@rug.ac.be>
Date: Wed, 11 Feb 1998 12:19:03 +0100
From: Vik Heyndrickx <Vik DOT Heyndrickx AT rug DOT ac DOT be>
Mime-Version: 1.0
To: Hans-Bernhard Broeker <broeker AT physik DOT rwth-aachen DOT de>
Cc: DJ Delorie <dj AT delorie DOT com>, djgpp-workers AT delorie DOT com
Subject: Re: char != unsigned char... sometimes, sigh (long)
References: <Pine DOT LNX DOT 3 DOT 93 DOT 980210162720 DOT 32596C-100000 AT acp3bf>

Hans-Bernhard Broeker wrote:
> 
> On Tue, 10 Feb 1998, Vik Heyndrickx wrote:
> 
> > - EOF is an element of the 'signed char' range which means that no
> > matter what trickery is applied, only 256 distinct values can be
> > represented of which EOF is one.
> 
> > This has as a consequence that locales
> > that have a real character defined for value (char)255 (i.e. EOF) cannot
> > be supported by ANY is* macro's, no matter how smart implemented.
> 
> I may sound repetitive, but that's not the full truth on this: they *can*
> support it. 

No, they cannot! This will sound mathematical, but you asked for it. The
domain of the is* macros contains only 256 elements. As these is*
macros are also functions in the mathematical sense of the word, they
cannot produce 257 different outcomes (pigeonhole principle), i.e. the
image of the function contains at most 256 elements; otherwise they
wouldn't be functions.

I know why you disagree, but consider this:
int c = getc (f);

How on earth should the user know whether the returned char is EOF or
the real character (char)255 defined in the non-C locale?
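
A minimal sketch of the collision, assuming a two's-complement target
where EOF is -1 (as with gcc on x86); the helper names are made up for
illustration. It shows what a byte becomes once it has passed through a
signed 'char', compared with the same byte kept as 'unsigned char':

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: the value a program sees when a byte is stored
   in a signed 'char' and then widened to int. The conversion of values
   above 127 is implementation-defined; gcc wraps modulo 256. */
static int as_signed_char_value(unsigned char byte)
{
    signed char c = (signed char)byte;
    return (int)c;
}

/* Hypothetical helper: the same byte seen through 'unsigned char'. */
static int as_unsigned_char_value(unsigned char byte)
{
    return (int)byte;
}
```

Under those assumptions, assert (as_signed_char_value (255) == EOF)
holds, while as_unsigned_char_value (255) stays 255 and remains
distinguishable from EOF.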

>             The only thing they can't do is to magically fix up
> non-portable programs that blindly call isalpha(c) for a 'char' value,
> instead of the correct isalpha((unsigned char)c). As I consider such
> programs to be buggy, I fail to see a problem with not supporting them.

BTW, you should have noted that my first concern is changing the default
signedness of 'char', as all the problems I cited are primarily caused by
'signed'.

> I fully agree with Eli: we shouldn't change such a rather fundamental
> design decision just to make broken user program turn un-broken.

As I said to Eli, not "just to".

> > - many users do not expect that 'α' (the Greek letter alpha with EASCII
> > value 224 in DOS CP 437 or 850) is not equal to 224. IMHO, this is
> > strongly counterintuitive. This triggers unexpected (and never wanted)
> > outputs in printf("%d", ...) and printf ("%u", ...) statements.
> 
> I don't think we should pay that much attention to what users _expect_,
> even more so if they expect the wrong things... E.g., I can't see any good
> reason why anyone should 'printf("%d", 'α'). If you want a char to
> represent a *number*, then why (other than for the IOCCC)  would you want
> to assign a value in '' notation to it? Users who mix up the use of 'char'
> for actual characters, and for 'small integers' might as well get what
> they asked for.

Consider (newbies are found to do this, without any objection on my
part):
---
int c;
for (c = 'a'; c < 'α'; ++c)
  printf ("%d", c);
---
It is much more intuitive for this code snippet to print a set of
consecutive numbers.
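
To make the difference concrete, here is a small sketch that counts how
often the loop body runs for each interpretation of the byte 0xE0 (224).
The function names are hypothetical, and the signed result assumes the
usual gcc behaviour of wrapping out-of-range values modulo 256:

```c
#include <assert.h>

/* Number of times the body of `for (c = 'a'; c < upper; ++c)` runs. */
static int count_iterations(int upper)
{
    int c, n = 0;
    for (c = 'a'; c < upper; ++c)
        ++n;
    return n;
}

/* Byte 0xE0 as a signed char constant would be seen (-32 with gcc). */
static int alpha_signed(void)
{
    return (signed char)0xE0;
}

/* Byte 0xE0 as an unsigned char constant would be seen (224). */
static int alpha_unsigned(void)
{
    return (unsigned char)0xE0;
}
```

With a signed 'char' the upper bound is negative, so
count_iterations (alpha_signed ()) is 0 and nothing is printed at all;
with an unsigned 'char' the loop runs from 97 up to 223.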

> > - All DOS compilers that I know about (not many), use 'unsigned char' by
> > default. SGI uses 'unsigned char' ;)
> 
> But I think all of them can be told to use 'signed char' instead. Even
> way old Turbo C 2 could.

As can djgpp, but users usually don't do that by default. In order to
maintain full compatibility with programs written specifically for those
compilers, you should only look at the default behaviour.
 
> > - A char can be used as an array subscript, especially in translation
> > tables. Most of the time (99%) the user does not expect that this value
> > can be negative.
> 
> I doubt you have enough statistical data to justify such a claim :-)

I have seen enough code that confirms this statement; unfortunately
I forgot to keep a list of "do"s and "don't"s ;-)
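
A sketch of the translation-table case, assuming an ASCII-compatible
character set and gcc's wrapping conversion to signed char. The table
and function names are invented for the example:

```c
#include <assert.h>

/* Hypothetical 256-entry translation table: identity, except that
   ASCII lower-case letters map to upper case. */
static unsigned char table[256];

static void init_table(void)
{
    int i;
    for (i = 0; i < 256; ++i)
        table[i] = (unsigned char)i;
    for (i = 'a'; i <= 'z'; ++i)
        table[i] = (unsigned char)(i - 'a' + 'A');
}

/* The subscript that table[c] would use for a signed char:
   negative for bytes above 127, i.e. outside the array. */
static int raw_subscript(signed char c)
{
    return c;
}

/* The subscript after the cast that portable code needs. */
static int safe_subscript(signed char c)
{
    return (unsigned char)c;
}

static unsigned char translate(signed char c)
{
    return table[safe_subscript(c)];
}
```

For the byte 0xE0, raw_subscript yields -32 while safe_subscript yields
224; code that writes table[c] directly only works by accident when
'char' is signed.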

-- 
 \ Vik /-_-_-_-_-_-_/   
  \___/ Heyndrickx /          
   \ /-_-_-_-_-_-_/
