Mail Archives: djgpp/2000/04/19/04:32:36
On Tue, 18 Apr 2000, J.P. Morris wrote:
> Assuming I can get it to compile so that it will work inside the
> debugger, what then?
I already described what I suggest doing with a debugger. Let me
elaborate:
Step 1: Make sure your crashes happen when the program runs inside
the debugger. To this end, simply say "gdb your-program",
then type "run <whatever-arguments-you-need>" and see if the
debugger says "Program got signal SIGSEGV" at some point.
Step 2: Find out what pointers get garbled. If you already know
that, skip this step. If not, use the crash traceback(s) to
find the register(s) which hold garbled pointers, then find
out what variables correspond to those registers. Section
12.2 in the FAQ has more info about this.
Step 3: Put a watchpoint on one or more of the offending pointer
variables and run the program. Inside GDB, typing the
command "watch foo" will interrupt the program each time the
variable `foo' changes its value. This works efficiently
only if the pointers involved do not normally change too
often, so that the program can run at or near its normal
speed.
Do NOT set more than 4 watchpoints, because GDB cannot watch
more than 16 bytes with hardware-assisted watchpoints (x86
has only 4 debug registers).
Step 4: Wait for the watchpoints to trigger, and when they do, GDB
will show you the line of code which overwrote that pointer.
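To make the 4-watchpoint limit from Step 3 concrete, here is a rough sketch in C of how the debug-register budget adds up. It assumes each x86 debug register covers 1, 2, or 4 bytes and that the covered region must be aligned to its own size. The functions debug_regs_needed and fits_in_hardware are made-up names for illustration; they are not part of GDB, and GDB's own splitting of a watched region may differ in detail.

```c
#include <stddef.h>

/* Hypothetical helper, not part of GDB: estimate how many x86 debug
   registers are needed to watch `len` bytes starting at `addr`.
   Assumption: one register covers 1, 2, or 4 bytes, and the region
   it covers must be aligned to its own size. */
int debug_regs_needed(unsigned long addr, size_t len)
{
    int regs = 0;

    while (len > 0) {
        size_t chunk = 4;

        /* Shrink the chunk until it is aligned and fits in what is left. */
        while (chunk > 1 && (addr % chunk != 0 || chunk > len))
            chunk /= 2;

        addr += chunk;
        len -= chunk;
        regs++;
    }
    return regs;
}

/* Nonzero if the region fits in the 4 debug registers of an x86 CPU. */
int fits_in_hardware(unsigned long addr, size_t len)
{
    return debug_regs_needed(addr, len) <= 4;
}
```

Under these assumptions an aligned 4-byte pointer costs one register, so four pointer watchpoints use up the whole budget, while a misaligned 4-byte region can cost up to three registers on its own.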
> I've only ever used debuggers with faults
> that occur every time, not intermittent ones. Stepping through the
> code will not be practical due to the sheer volume of code run each
> cycle of the game loop, most of which takes place inside a VM.
What VM is that?
Anyway, watchpoints are precisely the ``silver bullet'' that's
supposed to help you find these bugs, where some unknown code writes
to an address it isn't supposed to.
> CHECK_OBJECT(object); // This bombs out if object pointer is invalid
> move_object(object,10,10);
>
> Now, the worst thing is that quite often, the pointer passes the
> CHECK_OBJECT() test OK, but when it reaches move_object(), it has
> turned into 0x203206 or something.
>
> Does this suggest anything?
It suggests that move_object, or one of its subroutines, is the
culprit. Perhaps it overruns some array, or frees an object that
some other code still uses after it has been freed.
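Here is a minimal sketch of the array-overrun case, to show how a pointer can pass a check and still be garbage a moment later. Everything in it (struct game_state, clear_scratch, pointer_survives) is hypothetical and has nothing to do with your actual code; it only models a callee that writes more bytes than a buffer holds, with the pointer sitting right after that buffer.

```c
#include <stddef.h>
#include <string.h>

struct object { int x, y; };

/* Hypothetical layout: a scratch buffer that sits right before a pointer. */
struct game_state {
    char scratch[8];
    struct object *current;
};

/* Buggy helper: fills `count` bytes starting at scratch and trusts the
   caller's count.  It writes through the struct's own bytes, so the
   demo stays well defined while still modelling a write that runs
   past the end of `scratch`. */
void clear_scratch(struct game_state *gs, size_t count)
{
    unsigned char *bytes = (unsigned char *)gs;

    memset(bytes + offsetof(struct game_state, scratch), 0x20, count);
}

/* Returns 1 if the pointer survived the call, 0 if it was overwritten. */
int pointer_survives(size_t count)
{
    struct object obj = { 1, 2 };
    struct object *expected = &obj;
    struct game_state gs;

    memset(&gs, 0, sizeof gs);
    gs.current = &obj;          /* a validity check would pass here */
    clear_scratch(&gs, count);  /* the callee may clobber gs.current */

    /* Compare raw bytes so a garbled pointer is never dereferenced. */
    return memcmp(&gs.current, &expected, sizeof expected) == 0;
}
```

With count equal to sizeof(gs.scratch) the pointer is left alone; with a larger count it gets overwritten. If you call pointer_survives from a small main under GDB, break inside it and say "watch gs.current", the watchpoint should stop inside clear_scratch, which is the kind of answer you are after in the real program.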
> Also, I have just found, the program works without crashing in a DOS box,
> but crashes in pure DOS. None of the classic causes in the FAQ seem to
> be the problem, unless I'm missing something.
You still haven't posted a single crash message. Why? It's possible
there are valuable hints there that you are overlooking. Please don't
hide information from us.