Mail Archives: djgpp-workers/1998/02/12/07:49:05
Charles Marslett wrote:
> Well, Microsoft C compilers, Watcom C compilers and the traditional K&R
> compilers all default to signed. These are 3 of 4 major sources
> (Borland, I think, defaults to unsigned unless you use the IDE, then it defaults
> to whatever you used last, and Symantec defaults to unsigned -- but it has
> so many idiosyncrasies that Symantec code is virtually a language of its
> own anyway). So signed is the traditional treatment if you want the
> code to be most portable (in or out of the PC world).
You do not name enough compilers to conclude that 'signed char' is the
most portable choice, and even if it were, code that relies on it is
still not portable. The same is true of code that relies on 'unsigned
char'.
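To illustrate (a minimal sketch, not taken from any compiler's headers):
a program can ask <limits.h> which way plain char went, and it can get
the same character code under either default by converting through
unsigned char. The function names are made up for this example.

```c
#include <limits.h>

/* Returns 1 if plain char is signed on this implementation, 0 otherwise.
   CHAR_MIN is 0 when plain char is unsigned and negative when it is signed. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Returns the character code in the range 0..UCHAR_MAX, whatever the
   signedness of plain char happens to be. */
int char_code(char c)
{
    return (unsigned char)c;
}
```

Code that only ever looks at char_code(c) does not care which default the
compiler picked; code that compares a plain char against 128 or indexes
an array with it does.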
> For that matter, the
> pre-ANSI compilers had no 'signed' keyword, so if the default was not
> signed, there was no mechanism to create a signed byte value.
>
> So the decision may be too historical or parochial, but not implicitly
> dumb.
IMHO there is no reason for an ANSI-compliant compiler to support
traditional C programs by default, especially since this compiler can be
told to follow a specific compatibility rule. If traditional programs are
to be supported, they SHOULD either be converted to be fully portable
under ANSI or be compiled explicitly with '-fsigned-char' and
'-traditional'. Keeping those programs running on pre-ANSI compilers
is IMHO no longer a requirement, since the standard has been in place
for a considerable time. BTW I don't think such porting happens
often, and those programs were nearly always meant to run on systems
with 7-bit characters.
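As a rough sketch of what "fully portable under ANSI" could look like
for the is* functions: convert each char to unsigned char before the
call, so the argument is always in the range 0..UCHAR_MAX whichever
default the compiler uses. count_upper is a hypothetical helper written
for this example, not part of any library.

```c
#include <ctype.h>
#include <stddef.h>

/* Counts the upper-case letters in s according to the current locale.
   The cast keeps the argument to isupper() non-negative even when plain
   char is signed and the string holds bytes above 127. */
size_t count_upper(const char *s)
{
    size_t n = 0;

    for (; *s != '\0'; ++s) {
        if (isupper((unsigned char)*s))
            ++n;
    }
    return n;
}
```

Written this way the function should behave the same with '-fsigned-char'
and with '-funsigned-char', so no compatibility switch is needed for it.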
> > This unexpected effect is only understandable if a user thinks that
> > char type is somehow ``magical'' because it represents printable
> > characters. But that is not how C defines them: in C they are just
> > small integers.
> >
> > Also, ANSI allows an index of -1.
>
> As a matter of fact several 68K compilers I used (a decade ago, I have
> to admit), used 257 entry tables for the is* macros for exactly this
> reason. If called with an unsigned char (or int) value, the full 256
> character set was supported, but if called with a default (signed) char
> value, only 7-bit ASCII was supported. And of course, traditional ASCII
> is a 7-bit code, so "the user does not expect that this value can be
> negative" is a reasonable interpretation of ASCII characters in signed
> bytes. Of course, ISO and Unicode characters do not always fit into
> the 7 bits of a signed positive byte (Unicode doesn't even fit in
> an unsigned byte).
As Unicode can barely fit in a 16-bit value any more (!!!), it is not
at issue here. But ISO-8859-1 is, and the code of each character is an
unsigned number which currently does NOT fit in a (signed) char.
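To make the point concrete, here is a sketch of the kind of 257-entry
table mentioned above, under some loud assumptions: EOF is -1, bytes are
8 bits, and all names (my_ctype, MY_UPPER, my_isupper) are invented for
the example. It is not how any particular library does it.

```c
#include <string.h>

enum { MY_UPPER = 1 };  /* hypothetical class bit */

/* Slot 0 is for EOF (assumed to be -1); slots 1..256 are codes 0..255. */
static unsigned char my_ctype[257];

void my_ctype_init(void)
{
    const char *p;
    int c;

    memset(my_ctype, 0, sizeof my_ctype);

    /* Basic Latin capitals, listed explicitly so no ASCII ordering
       is assumed for them. */
    for (p = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"; *p != '\0'; ++p)
        my_ctype[(unsigned char)*p + 1] |= MY_UPPER;

    /* ISO-8859-1 capitals: 0xC0..0xDE, except 0xD7 (multiplication sign). */
    for (c = 0xC0; c <= 0xDE; ++c) {
        if (c != 0xD7)
            my_ctype[c + 1] |= MY_UPPER;
    }
}

/* c must be EOF (-1) or a value in 0..255. A signed char holding an
   ISO-8859-1 capital is negative and must be converted to unsigned char
   first, otherwise the index falls in front of the table. */
int my_isupper(int c)
{
    return (my_ctype[c + 1] & MY_UPPER) != 0;
}
```

With a signed plain char, my_isupper(ch) is only safe for 7-bit values;
my_isupper((unsigned char)ch) covers the whole ISO-8859-1 set.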
--
\ Vik /-_-_-_-_-_-_/
\___/ Heyndrickx /
\ /-_-_-_-_-_-_/