Mail Archives: djgpp-workers/1998/02/12/07:49:05

Sender: vheyndri AT rug DOT ac DOT be
Message-Id: <34E2CF49.1454@rug.ac.be>
Date: Thu, 12 Feb 1998 11:30:33 +0100
From: Vik Heyndrickx <Vik DOT Heyndrickx AT rug DOT ac DOT be>
Mime-Version: 1.0
To: charles DOT marslett AT tempe DOT vlsi DOT com
Cc: Eli Zaretskii <eliz AT is DOT elta DOT co DOT il>, djgpp-workers AT delorie DOT com
Subject: Re: char != unsigned char... sometimes, sigh (long)
References: <Pine DOT SUN DOT 3 DOT 91 DOT 980211114656 DOT 15677A-100000 AT is> <34E1F2BE DOT 24BB AT tempe DOT vlsi DOT com>

Charles Marslett wrote:
> Well, Microsoft C compilers, Watcom C compilers and the traditional K&R
> compilers all default to signed.  These are 3 of 4 major sources
> (Borland, I think, defaults to unsigned unless you use the IDE, then it defaults
> to whatever you used last, and Symantec defaults to unsigned -- but it has
> so many idiosyncrasies that Symantec code is virtually a language of its
> own anyway).  So signed is the traditional treatment if you want the
> code to be most portable (in or out of the PC world).  

You have not listed enough compilers to conclude that 'signed char' is
the most portable choice, and even if it were, it would still not be
portable. The same is true of 'unsigned char'.

> For that matter, the
> pre-ANSI compilers had no 'signed' keyword, so if the default was not
> signed, there was no mechanism to create a signed byte value.
> 
> So the decision may be too historical or parochial, but not implicitly
> dumb.

IMHO there is no reason to support traditional C programs within an
ANSI-compliant compiler, especially because this compiler can be
instructed to follow a specific compatibility rule. If traditional
programs are to be supported, they SHOULD either be converted to be
fully portable under ANSI C or be compiled explicitly with
'-fsigned-char' and '-traditional'. Keeping those programs working on
pre-ANSI compilers is IMHO no longer a requirement, since the standard
has been in place for a considerable time. BTW I don't think such
porting happens often, and those programs were nearly always meant to
run on systems with 7-bit character sets.
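
To illustrate what "converted to be fully portable" can mean in
practice, here is a minimal sketch. The function names are made up for
this example and do not come from any library, and it assumes 8-bit
chars. The naive comparison gives a different answer depending on
whether plain char is signed (for example under '-fsigned-char'), while
the version that converts through unsigned char does not.

```c
#include <limits.h>

/* Hypothetical helpers, named for illustration only. */

/* Nonzero if plain char is signed in this build. */
static int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Naive test for the byte 0xE9. With a signed plain char, c promotes
   to a negative int and can never equal 0xE9 (233). */
static int naive_is_byte_e9(char c)
{
    return c == 0xE9;
}

/* Portable test: convert to unsigned char first, so the value compared
   is always in the range 0..255. */
static int portable_is_byte_e9(char c)
{
    return (unsigned char)c == 0xE9;
}
```

Only the second test behaves the same with and without
'-fsigned-char'; some compilers also warn that the naive comparison is
always false when plain char is signed.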

> > This unexpected effect is only understandable if a user thinks that
> > char type is somehow ``magical'' because it represents printable
> > characters.  But that is not how C defines them: in C they are just
> > small integers.
> >
> > Also, ANSI allows an index of -1.
> 
> As a matter of fact several 68K compilers I used (a decade ago, I have
> to admit), used 257 entry tables for the is* macros for exactly this
> reason.  If called with an unsigned char (or int) value, the full 256
> character set was supported, but if called with a default (signed) char
> value, only 7-bit ASCII was supported.  And of course, traditional ASCII
> is a 7-bit code, so "the user does not expect that this value can be
> negative" is a reasonable interpretation of ASCII characters in signed
> bytes.  Of course, ISO and Unicode characters do not always fit into
> the 7 bits of a signed positive byte (Unicode doesn't even fit in
> an unsigned byte).
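
For readers who have not seen the 257-entry layout, a rough sketch
follows. All names are hypothetical and this is not the code of any
real C library: slot 0 serves index -1 (the usual value of EOF) and
slots 1..256 serve the values 0..255. Only two classes are filled in,
and the 'A'..'Z' loop assumes an ASCII-compatible character set.

```c
#include <assert.h>

/* Hypothetical names; a sketch of the layout, not a real libc. */
#define MY_UPPER 0x01
#define MY_DIGIT 0x02

/* Slot 0 holds index -1; slots 1..256 hold the values 0..255. */
static unsigned char my_ctype_table[257];

/* Fill in the two classes used below. */
static void my_ctype_init(void)
{
    int c;

    for (c = '0'; c <= '9'; c++)
        my_ctype_table[c + 1] |= MY_DIGIT;
    for (c = 'A'; c <= 'Z'; c++)      /* assumes ASCII ordering */
        my_ctype_table[c + 1] |= MY_UPPER;
}

/* Valid only for -1..255. A negative value other than -1, such as a
   signed char holding an 8-bit character, would index out of bounds. */
static int my_isdigit(int c)
{
    assert(c >= -1 && c <= 255);
    return (my_ctype_table[c + 1] & MY_DIGIT) != 0;
}

static int my_isupper(int c)
{
    assert(c >= -1 && c <= 255);
    return (my_ctype_table[c + 1] & MY_UPPER) != 0;
}
```

After a call to my_ctype_init(), my_isdigit(-1) is a defined lookup
that returns 0, which is the whole point of the extra slot.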

As Unicode can barely fit in a 16-bit value anymore (!!!), it is not at
issue here. But ISO-8859-1 is, and the code of each character is an
unsigned number in the range 0-255, which currently does NOT fit in a
char.
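
A small sketch of the usual workaround, assuming 8-bit chars; the
wrapper names are hypothetical and exist only for this example. The
character code is recovered by converting through unsigned char, and
the same conversion is applied before calling a <ctype.h> function,
since those functions are only defined for EOF and for values
representable as unsigned char.

```c
#include <ctype.h>

/* Hypothetical wrappers, named for illustration only. */

/* ISO-8859-1 code (0..255) of a character stored in a plain char,
   whether plain char is signed or unsigned. */
static int latin1_code(char c)
{
    return (unsigned char)c;
}

/* isalpha() with the argument converted to unsigned char first, so a
   negative plain char is never passed to the library. */
static int safe_isalpha(char c)
{
    return isalpha((unsigned char)c) != 0;
}
```

Whether safe_isalpha() accepts a character above 127 still depends on
the current locale; the conversion only guarantees that the call itself
is well defined.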

-- 
 \ Vik /-_-_-_-_-_-_/   
  \___/ Heyndrickx /          
   \ /-_-_-_-_-_-_/

