Message-ID: <34E1F2BE.24BB@tempe.vlsi.com>
Date: Wed, 11 Feb 1998 11:49:34 -0700
From: Charles Marslett
Reply-To: charles DOT marslett AT tempe DOT vlsi DOT com
Organization: VLSI Technology, Inc.
MIME-Version: 1.0
To: Eli Zaretskii
CC: djgpp-workers AT delorie DOT com
Subject: Re: char != unsigned char... sometimes, sigh (long)
References:
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Precedence: bulk

Eli Zaretskii wrote:
>
> On Tue, 10 Feb 1998, Vik Heyndrickx wrote:
>
> > - All DOS compilers that I know about (not many), use 'unsigned char' by
> > default. SGI uses 'unsigned char' ;)
>
> It would be interesting to know why did GCC choose signed char for
> x86. Does anybody know? Should we ask the GCC maintainers? Or maybe
> somebody can tell what are the advantages of signed char?
>
> The reason I think this would be educational is that Vik lists so many
> disadvantages of this choice, it almost makes you think GCC is dumb.

Well, Microsoft C compilers, Watcom C compilers and the traditional K&R
compilers all default to signed. These are 3 of 4 major sources (Borland,
I think, defaults to unsigned unless you use the IDE, then it defaults to
whatever you used last, and Symantec defaults to unsigned -- but it has so
many idiosyncrasies that Symantec code is virtually a language of its own
anyway). So signed is the traditional treatment if you want the code to
be most portable (in or out of the PC world).

For that matter, the pre-ANSI compilers had no 'signed' keyword, so if the
default was not signed, there was no mechanism to create a signed byte
value.

So the decision may be too historical or parochial, but not implicitly
dumb.

> > - A char can be used as an array subscript, especially in translation
> > tables. Most of the time (99%) the user does not expect that this value
> > can be negative.
>
> This unexpected effect is only understandable if a user thinks that
> char type is somehow ``magical'' because it represents printable
> characters. But that is not how C defines them: in C they are just
> small integers.
>
> Also, ANSI allows an index of -1.

As a matter of fact, several 68K compilers I used (a decade ago, I have to
admit) used 257-entry tables for the is* macros for exactly this reason.
If called with an unsigned char (or int) value, the full 256-character set
was supported, but if called with a default (signed) char value, only
7-bit ASCII was supported.

And of course, traditional ASCII is a 7-bit code, so "the user does not
expect that this value can be negative" is a reasonable interpretation of
ASCII characters in signed bytes. Of course, ISO and Unicode characters
do not always fit into the 7 bits of a signed positive byte (Unicode
doesn't even fit in an unsigned byte).

> > - If the user want his program to behave in an implementation specific
> > way he can always specify "-funsigned-char" or "-fsigned-char" at the
> > command line.
>
> Or even edit their lib/specs to make it the default.
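
To make the table-subscript point above concrete, here is a minimal sketch
in C of a 257-entry table of the kind described for the is* macros. It is
not taken from any of the compilers mentioned: the names is_high_storage,
is_high, init_table, index_from_signed and index_from_unsigned are all
hypothetical, and the sketch assumes 8-bit bytes. It shows that byte 0xE9
held in a signed char reads back as -23, which falls outside the table,
while converting through unsigned char first gives the intended index 233.

```c
#include <assert.h>

/* Hypothetical 257-entry table: one slot for index -1 (EOF), then one
   slot per byte value 0..255.  All names here are illustrative only. */
static unsigned char is_high_storage[257];

/* Classify a value in the range -1..255, the range the is* macros
   are required to accept. */
int is_high(int c)
{
    assert(c >= -1 && c <= 255);   /* anything else is out of bounds */
    return is_high_storage[c + 1];
}

/* Mark every byte value above 7-bit ASCII (128..255) in the table. */
void init_table(void)
{
    int c;
    for (c = 128; c <= 255; c++)
        is_high_storage[c + 1] = 1;
}

/* What a char yields as a subscript when char is signed. */
int index_from_signed(signed char ch)
{
    return ch;
}

/* The portable idiom: convert through unsigned char first. */
int index_from_unsigned(signed char ch)
{
    return (unsigned char)ch;
}
```

After init_table(), is_high(index_from_unsigned(ch)) is safe for any byte,
whereas passing index_from_signed(ch) directly only works for 7-bit values
(and for the single value -1, which lands on the EOF slot).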