Mail Archives: djgpp/2000/04/28/17:27:16
Eli Zaretskii <eliz AT is DOT elta DOT co DOT il> wrote in article
<39086612 DOT 15C4E78A AT is DOT elta DOT co DOT il>...
> Joel Hunsberger wrote:
> >
> > The problem was that this code extensively corrupted the stack
> > every time. It turns out that stating 255 as the buffer length
> > was causing an implicit conversion overflow when converted for
> > use in the string, and subsequently for use by cputs. Alas,
> > it came out as buffer length of -1, which caused manifest
> > stack corruption (for reasons I can only imagine.)
> >
> > When I reduce the (largely arbitrary) requirement to 127
> > for console line input... things are fine!
> >
> > No hints in the info documentation for cgets, unfortunately.
>
> I don't understand: the library docs explicitly say that the first
> character in the buffer is used as the buffer size. So what is missing?
>
> Do you mean to say that it was not known to you that the char data type
> is signed, and that therefore 255 is actually -1?
>
Yes... That is essentially the root cause (sorry I did not remember it
at the time). So, this problem occurred due to a combination of
oversights on my part.
First, I defined a line buffer (arbitrarily) to be
char lbuf[256];
Why? I don't know, other than 256 is a tempting number to pick for
"paragraph" oriented memory architecture (and I wanted it to be more
than big enough.)
I might have detected the problem sooner if I had declared
"unsigned char lbuf[256];" because...
The problem occurred when I assigned the first character as the buffer
length (as explained in the info for cgets). I chose "255" so there
would always be room for a terminating \0 for any user input... thus, I
laid the trap for myself...
lbuf[0] = 255; /***** BIG SILENT PROBLEM when lbuf is char ****/
Because lbuf is (char) and not (unsigned char), the compiler does an
implicit conversion on 255 to shove it into signed char space,
interpreting it as -1 when it is stored as lbuf[0].
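
The conversion described above can be reproduced in isolation. This is
only a minimal sketch: the helper names stored_as_signed and
stored_as_unsigned are hypothetical, and signed char is spelled out so
the result does not depend on whether plain char is signed on a given
compiler. Strictly speaking the C standard leaves the out-of-range
conversion implementation-defined; GCC on two's-complement machines
wraps it around.

```c
/* Round-trip a requested length through a one-byte slot, the way the
   assignment lbuf[0] = 255 does. Both helper names are hypothetical. */

/* Mimics storing into a signed char buffer. On GCC, values above 127
   wrap around to negative numbers. */
int stored_as_signed(int requested)
{
    signed char slot = (signed char)requested;
    return slot;   /* promoted back to int, sign preserved */
}

/* Mimics storing into an unsigned char buffer. Values 0..255 survive. */
int stored_as_unsigned(int requested)
{
    unsigned char slot = (unsigned char)requested;
    return slot;
}
```

With GCC, stored_as_signed(255) comes back as -1, while
stored_as_unsigned(255) comes back as 255, and both helpers return 127
unchanged.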
Now, the interesting part is that the implicit conversion "taking place"
is only reported when the compiler switch -pedantic is ON. Without it,
compilation was silent and apparently successful. However, when cgets
receives a -1 as the buffer length, the stack gets corrupted... Data back
from cgets is (in fact) okay (although the stack is corrupted), so the
next return goes off in the weeds.
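
One way to avoid handing cgets a negative length is to range-check the
value before it goes into the first byte. The following is a sketch
under assumptions, not anything from the library: set_length_byte is a
hypothetical helper, and the limit SCHAR_MAX (127) is chosen because it
fits in a char whether or not char is signed.

```c
#include <limits.h>

/* Hypothetical helper: store `length` in buf[0] only if it fits in a
   char regardless of signedness. Returns 0 on success, or -1 if the
   value was rejected, in which case the buffer is left untouched. */
int set_length_byte(char *buf, int length)
{
    if (length < 1 || length > SCHAR_MAX) {
        return -1;              /* would wrap, as 255 does */
    }
    buf[0] = (char)length;
    return 0;
}
```

A caller would check the return value before calling cgets, so that a
request for 255 is refused up front instead of silently turning into -1.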
If I had declared "unsigned char lbuf[256];" then the compiler would have
more quickly flagged that I was attempting to pass an "unsigned char *"
argument when expecting "char *".
"> Do you mean to say that it was not known to you... "
Are you really that surprised?
Yes... I confess that I make many stupid errors coding C. In this
case I had worked with (char) on text for so long that I essentially forgot
the significant difference from (unsigned char). That is what I think might
be useful for others (or, more to the point, those who need help, like me).
So, (you asked) what is missing?... I needed to observe that (char),
being signed here, is valid only for values -128 to 127. That's all!
The info says it all by showing the prototype as
"char *cgets(char *_str);" Only a real newbie (like me) would overlook
that (char) is not able to accept 255!!
Thanks for asking... (I am guilty as charged! :-) gdb gets a real workout
when I try to code C.
(I will post more when I fall into another embarrassing pothole :-)
Joel Hunsberger