Sender: vheyndri AT rug DOT ac DOT be
Message-Id: <34E2CF49.1454@rug.ac.be>
Date: Thu, 12 Feb 1998 11:30:33 +0100
From: Vik Heyndrickx
Mime-Version: 1.0
To: charles DOT marslett AT tempe DOT vlsi DOT com
Cc: Eli Zaretskii, djgpp-workers AT delorie DOT com
Subject: Re: char != unsigned char... sometimes, sigh (long)
References: <34E1F2BE DOT 24BB AT tempe DOT vlsi DOT com>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Precedence: bulk

Charles Marslett wrote:
> Well, Microsoft C compilers, Watcom C compilers and the traditional K&R
> compilers all default to signed.  These are 3 of 4 major sources
> (Borland, I think, defaults to unsigned unless you use the IDE, then it defaults
> to whatever you used last, and Symantec defaults to unsigned -- but it has
> so many idiosyncrasies that Symantec code is virtually a language of its
> own anyway).  So signed is the traditional treatment if you want the
> code to be most portable (in or out of the PC world).

You do not mention enough compilers to conclude that 'signed char' is
the most portable choice, and even if it were, relying on it remains
non-portable, which is equally true of 'unsigned char'.

> For that matter, the
> pre-ANSI compilers had no 'signed' keyword, so if the default was not
> signed, there was no mechanism to create a signed byte value.
>
> So the decision may be too historical or parochial, but not implicitly
> dumb.

IMHO there is no reason for an ANSI-compliant compiler to support
traditional C programs by default, especially since this compiler can
be instructed to follow a specific compatibility rule. If traditional
programs are to be supported, they SHOULD either be converted to be
fully portable under ANSI C or be compiled explicitly with
'-fsigned-char' and '-traditional'. Keeping those programs able to run
on pre-ANSI compilers is IMHO no longer a requirement, since the
standard has been in place for a considerable time.
BTW, I don't think such porting happens often, and those programs were
nearly always meant to run on 7-bit character systems.

> > This unexpected effect is only understandable if a user thinks that
> > char type is somehow ``magical'' because it represents printable
> > characters.  But that is not how C defines them: in C they are just
> > small integers.
> >
> > Also, ANSI allows an index of -1.
>
> As a matter of fact several 68K compilers I used (a decade ago, I have
> to admit), used 257 entry tables for the is* macros for exactly this
> reason.  If called with an unsigned char (or int) value, the full 256
> character set was supported, but if called with a default (signed) char
> value, only 7-bit ASCII was supported.  And of course, traditional ASCII
> is a 7-bit code, so "the user does not expect that this value can be
> negative" is a reasonable interpretation of ASCII characters in signed
> bytes.  Of course, ISO and Unicode characters do not always fit into
> the 7 bits of a signed positive byte (Unicode doesn't even fit in
> an unsigned byte).

Since Unicode can barely fit in a 16-bit value anymore (!!!), it is not
at issue here. But ISO-8859-1 is, and the code of each character is an
unsigned number which currently does NOT fit in a char.

-- 
 \ Vik /-_-_-_-_-_-_/
  \___/ Heyndrickx /
   \ /-_-_-_-_-_-_/