From: Hans-Bernhard Broeker
Newsgroups: comp.os.msdos.djgpp
Subject: Re: dead beef
Date: 19 Apr 2000 10:50:21 GMT
Organization: Aachen University of Technology (RWTH)
Lines: 83
Message-ID: <8dk31d$m3n$1@nets3.rz.RWTH-Aachen.DE>
References: <38FC4A45 DOT 54C24CDF AT bigfoot DOT com> <8dhrpn$q3s$1 AT nets3 DOT rz DOT RWTH-Aachen DOT DE> <38FCB4D5 DOT B3BB6044 AT bigfoot DOT com>
NNTP-Posting-Host: acp3bf.physik.rwth-aachen.de
X-Trace: nets3.rz.RWTH-Aachen.DE 956141421 22647 137.226.32.75 (19 Apr 2000 10:50:21 GMT)
X-Complaints-To: abuse AT rwth-aachen DOT de
NNTP-Posting-Date: 19 Apr 2000 10:50:21 GMT
Originator: broeker@
To: djgpp AT delorie DOT com
DJ-Gateway: from newsgroup comp.os.msdos.djgpp
Reply-To: djgpp AT delorie DOT com

J.P. Morris wrote:
> Hans-Bernhard Broeker wrote:

[Using the zeroing sbrk(), you get a NULL pointer dereference, which is
different behaviour from what the default, or the 'deadbeef' one
yield. ]

>> You should have tried that in a debugger, and checked where this NULL
>> pointer came from, to find the bug.

> Assuming I can get it to compile so that it will work inside the
> debugger, what then?  I've only ever used debuggers with faults
> that occur every time, not intermittent ones.

It doesn't matter all that much. Just run the program inside the
debugger. When the program would normally crash due to SIGSEGV or NULL
pointer dereference, the debugger will fire up, with the necessary
information about variables and source code positions so you can see
*which* pointer variable is bad, and what value it has, and how the
execution flow of the program got to that place.

Now, as Eli already pointed out, you'll want to repeat the program's
execution, but this time, set a watchpoint on the pointer variable
that contained the invalid data (NULL, or garbage), in the first
run. I'd advise using the 0xdeadbeef method, and maybe make the
watchpoint conditional so it only really triggers if the new value of
the pointer is really the 'dead beef' one.
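
To make the 'dead beef' test concrete, here is a minimal sketch in
portable C. Everything in it is an assumption made for illustration:
'struct object' and its 'next' member merely stand in for whatever the
real program uses, the helper names are invented, and the fill is done
by hand rather than by the 'deadbeef' sbrk(). It only shows what an
unfilled pointer looks like and how one might test for it.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the program's object type. */
struct object {
    struct object *next;
    int x, y;
};

#define DEAD_BEEF 0xdeadbeefu

/* Fill a block with the 0xdeadbeef pattern, imitating by hand what the
   'deadbeef' sbrk() does to memory the program has not written yet. */
static void fill_dead_beef(void *block, size_t size)
{
    const uint32_t pattern = DEAD_BEEF;
    const unsigned char *src = (const unsigned char *)&pattern;
    unsigned char *dst = block;
    size_t i;

    assert(block != NULL);
    for (i = 0; i < size; i++)
        dst[i] = src[i % sizeof pattern];
}

/* Nonzero if the low 32 bits of the pointer value are the fill pattern,
   i.e. the pointer was most likely never assigned. */
static int is_dead_beef(const void *ptr)
{
    return (uint32_t)(uintptr_t)ptr == DEAD_BEEF;
}

/* Allocate an object and deliberately leave it holding only the fill
   pattern, the way a forgotten initialisation would. */
static struct object *alloc_unfilled_object(void)
{
    struct object *obj = malloc(sizeof *obj);

    if (obj != NULL)
        fill_dead_beef(obj, sizeof *obj);
    return obj;
}

/*
 * The matching conditional watchpoint in gdb could look roughly like
 * this, assuming the watchpoint was given number 2 and a 32-bit target
 * such as DJGPP:
 *
 *     watch obj->next
 *     condition 2 obj->next == (struct object *) 0xdeadbeef
 */
```

With this, is_dead_beef(obj->next) is nonzero for an object that came
from alloc_unfilled_object(), and zero for NULL or for any pointer that
malloc() itself returned.
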

> Since I will only know the bug has occurred when it has already happened,
> how can I trace this kind of bug?

If the bug is anywhere near reproducible, things like breakpoints with
ignore-counts can be useful. I.e. if you know that the bug happens in
a certain routine, but only on the 12345th invocation of it, you can

	break routine
	# let's say the breakpoint got number 11
	ignore 11 12344
	run

To find the right number of invocations, set the ignore count to a
very high number, and once the crash has happened, use 'info break' to
see how many times the breakpoint was ignored, before the crash.

[...]
> 	CHECK_OBJECT(object); // This bombs out if object pointer is invalid
> 	move_object(object,10,10);

> Now, the worst thing is that quite often, the pointer passes the
> CHECK_OBJECT() test OK, but when it reaches move_object(), it has
> turned into 0x203206 or something.

That would hint at CHECK_OBJECT() as the culprit. It may be modifying
the value of 'object', in certain, rare cases.

> Also, I have just found, the program works without crashing in a DOS
> box, but crashes in pure DOS.  None of the classic causes in the FAQ
> seem to be the problem, unless I'm missing something.

That's expected behaviour if you used the zeroing sbrk(). The
resulting NULL pointer dereference is not caught by the Windows DPMI
host. CWSDPMI, on the other hand, will catch it, so that's where that
difference would come from. And this _is_ in the FAQ, unless my memory
is betraying me even beyond its usual lossiness.

>> Right. To detect overruns or underruns in arrays not coming from
>> malloc() (i.e. automatic ones on the stack, or static ones), you need
>> other tools.
[...]
> I'll try that.  If there is a problem like this it should find it
> even if the game doesn't crash in linux.

Well, actually, given the way your bug changes behaviour as you switch
from one version of sbrk() to the other, the actual source of the bug
must at least partly be related to malloc()ed storage.
It's using the 'random garbage' from malloc()ed blocks that haven't
been filled with anything sensible, yet. The only truly puzzling part
of this is that the crash manages to avoid happening for such a long
runtime of the program.

-- 
Hans-Bernhard Broeker (broeker AT physik DOT rwth-aachen DOT de)
Even if all the snow were burnt, ashes would remain.