From: jjf AT bcs DOT org DOT uk (J. J. Farrell)
Subject: Re: some unusual errors
Date: 23 Sep 1998 08:34:35 -0700
Message-ID: <199809222130.OAA02403.cygnus.gnu-win32@aleph.ssd.hal.com>
References: <3606527E DOT 2781 AT delorie DOT com>
Content-Type: text
To: dj AT delorie DOT com (DJ Delorie)
Cc: gnu-win32 AT cygnus DOT com

> From: DJ Delorie <dj AT delorie DOT com>
> 
> > This looks like a bug in the macro implementation of
> > isspace(), to me; it should be casting the parameter to
> > int so it will behave in the same way as the function
> > implementation. Until a fix appears, your options are:
> 
> This discussion happens a lot in the djgpp newsgroup.  The
> result is always the same: You can't cast to int in the
> macro.  Why?  Because:

No - you're missing the point. isspace() is defined in terms
of a prototyped function which takes an int. If I were using
the function form instead of the macro form and passed a char
as the parameter, that char (or other compatible arithmetic
type) would be silently and automatically promoted to an int
before the function was invoked, exactly as if it had been
assigned to an int. There's little reason why the macro
shouldn't include an explicit cast to int which would have
almost exactly the same effect as the automatic 'cast' in
the function case.
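
A minimal sketch of what I mean. The names as_function, AS_MACRO
and same_value are made up for illustration and are not taken
from any real <ctype.h>:

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-in for the function form: prototyped to take an int. */
static int as_function(int c)
{
    return c;
}

/* Hypothetical stand-in for the macro form, with the explicit cast. */
#define AS_MACRO(c) ((int)(c))

/* Returns 1 if both forms see the same int value for the given char. */
static int same_value(char ch)
{
    int via_function = as_function(ch);  /* implicit conversion to int */
    int via_macro = AS_MACRO(ch);        /* explicit cast to int */

    /* Either way the int holds the char's own value, unchanged. */
    assert(via_function >= CHAR_MIN && via_function <= CHAR_MAX);
    return via_function == via_macro;
}
```

Whatever value the char holds, both routes hand the same int to
the code that does the real work.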

> 1. If the programmer used the macro with a char argument
> (when char is signed), the macro can't tell the difference
> between EOF and char 0xff

While this is certainly true, I don't see what it has to do
with whether or not the macro can include a cast. Assuming
8-bit chars, 0xff in a signed char is a negative number.
Unless its value happens to equal EOF, it breaks the
constraints on allowed values for the ctype functions, and
unleashes undefined behaviour. Whether or not it has been
promoted to an int is irrelevant.

> 2. If char is signed, a cast to int may not do what the
> macro is expecting.  The is*() macros normally expect a
> parameter in the range 0..255 or -1, but if you cast a
> signed char to int, you get values in the range -128..127 or
> -1, and your program may crash.

Again, all true but I don't see the relevance. Signed 8-bit
2's complement chars already have values in the range -128
to 127; casting to an int doesn't change these values.
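
To illustrate, here is a small sketch (widen and
widen_via_unsigned are hypothetical helper names). The first
function shows that the conversion to int preserves the value;
the second shows the usual way a programmer can bring a char
into the range the ctype functions accept, which is a separate
matter from whether the macro casts:

```c
#include <assert.h>
#include <limits.h>

/* Converting a signed char to int preserves its value, negative or not. */
static int widen(signed char sc)
{
    return (int)sc;
}

/* Going through unsigned char first maps the value into 0..UCHAR_MAX,
   the range the ctype functions accept (apart from EOF). */
static int widen_via_unsigned(signed char sc)
{
    int v = (int)(unsigned char)sc;

    assert(v >= 0 && v <= UCHAR_MAX);
    return v;
}
```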

> 3. Remember that getchar() returns an int - for the very
> same reason, so that EOF is not in the range of valid
> characters.  getchar() returns EOF or 0..255, which is NOT
> the same range as the range for type `signed char'.

I think we may be at cross purposes. Again this is all true
(except that there's no guarantee that EOF will be outside
the range of valid characters), but doesn't have any relevance
to whether or not a macro implementation of isspace() can or
should cast its parameter to an int.

> Basically, if you're a programmer and you've stumbled onto
> this problem, you have a problem with your code.

The cases you're talking about are bugs anyway - whether or
not the parameter gets promoted to an int is irrelevant.
Chars are just small integers; in the usual context of this
list, signed chars have values in the range -128 to 127. The
values passed to the ctype macros must be in the range 0 to
UCHAR_MAX (255 in this context) or EOF which is some specified
negative number. The programmer is required to ensure that
the parameter has a value in this range. As long as the value
is acceptable, he can put it in an int or any arithmetic type
he fancies which is smaller than int - including a signed char.
That would be correct input to the isspace() function, and
ought not to result in spurious messages from the compiler if
the macro version is used instead of the function. The best
way to make sure such messages don't appear is for the macro
to convert the parameter to an int.

Whether or not it is sensible to hold characters in signed
chars is an entirely different question. It is probably not
a good idea in general, especially if you want to use the
ctype functions on them. However, if I happen to know that
the values held in my signed chars are within the range which
is acceptable to the ctype functions, then there's no reason
why I can't pass them directly to the functions; I shouldn't
end up with confusing messages about using a char as an array
subscript!
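
For concreteness, here is a rough sketch of the kind of
table-driven macro I have in mind. Everything in it is
hypothetical: the table my_space_table, the two macros and
demo_isspace are invented for illustration, and it assumes
EOF is -1 and an ASCII-like character set. I am not claiming
this is how any particular library does it:

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical lookup table. Slot 0 stands for EOF (assumed to be -1);
   slot c + 1 is nonzero when c is a white-space character. */
static const unsigned char my_space_table[UCHAR_MAX + 2] = {
    [1 + ' ']  = 1, [1 + '\t'] = 1, [1 + '\n'] = 1,
    [1 + '\v'] = 1, [1 + '\f'] = 1, [1 + '\r'] = 1
};

/* Without a cast: called with a plain char, the subscript has type char,
   and gcc -Wall (-Wchar-subscripts) warns about it. */
#define MY_ISSPACE_NOCAST(c) ((my_space_table + 1)[c])

/* With the cast: same lookup, but the subscript is always an int. */
#define MY_ISSPACE(c) ((my_space_table + 1)[(int)(c)])

/* Caller's obligation: the value is EOF (-1 here) or in 0..UCHAR_MAX. */
static int demo_isspace(char ch)
{
    assert(ch >= -1);
    return MY_ISSPACE(ch) != 0;
}
```

The cast changes nothing about which values are legal; it only
stops the compiler complaining about a call that would have been
fine with the function form.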

The only problem with including a cast in the macro is that it
may suppress warnings if the macro is called with a type which
is not compatible with int. A better (but slightly more complex)
solution would be for the macro to define a working int and
assign the parameter into the int before doing anything else.
I should think the compiler ought to be able to optimise out
the inefficiencies of this.
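
Standard C doesn't let a macro declare a variable in the middle
of an expression, so as a sketch I have swapped in a small inline
function whose int parameter plays the part of the working int.
The name my_isspace_checked is made up, and the assert is only
there to spell out the caller's obligation:

```c
#include <assert.h>
#include <ctype.h>
#include <limits.h>
#include <stdio.h>   /* for EOF */

/* Hypothetical helper: the int parameter acts as the "working int".
   A char argument is converted as if by assignment; an argument of an
   incompatible type (a pointer, say) still draws a diagnostic, which
   an explicit cast in a macro would have hidden. */
static inline int my_isspace_checked(int c)
{
    /* The value must be EOF or representable as an unsigned char. */
    assert(c == EOF || (c >= 0 && c <= UCHAR_MAX));
    return isspace(c) != 0;
}
```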
