Date: Mon, 22 Jan 2001 22:19:58 +0200
From: "Eli Zaretskii"
Sender: halo1 AT zahav DOT net DOT il
To: Martin Strömberg
Message-Id: <3405-Mon22Jan2001221957+0200-eliz@is.elta.co.il>
X-Mailer: Emacs 20.6 (via feedmail 8.3.emacs20_6 I) and Blat ver 1.8.6
CC: djgpp-workers AT delorie DOT com
In-reply-to: <200101221730.SAA15795@father.ludd.luth.se>
	(message from Martin Strömberg on Mon, 22 Jan 2001 18:30:32 +0100 (MET))
Subject: Re: Debugging on 386
References: <200101221730 DOT SAA15795 AT father DOT ludd DOT luth DOT se>
Reply-To: djgpp-workers AT delorie DOT com
Errors-To: nobody AT delorie DOT com
X-Mailing-List: djgpp-workers AT delorie DOT com
X-Unsubscribes-To: listserv AT delorie DOT com
Precedence: bulk

> From: Martin Strömberg
> Date: Mon, 22 Jan 2001 18:30:32 +0100 (MET)
>
> Well it seems gdb can go into loop if I insist on "n":

Patient: Doctor, it hurts when I do like this!
Doctor: Then don't do that.

Seriously, though: this issue of debugging under signals is
exceedingly tricky. It is a miracle it works at all (mostly, no
thanks to me).

GDB gets all the exceptions, whether in its own code or in the
debuggee's code. For each exception, GDB must decide where it
originated---non-trivial trick number one. If it decides that the
exception originated from the debuggee, it needs to jump to the
debuggee's exception code---non-trivial trick number two (how do you
jump to code which is in exceptn.S and thus usually doesn't have any
debug info?).

So, if you need to debug a program and you know it will generate
exceptions, like in this case, you should do everything to get out of
harm's way. Set the related signal(s) to nostop noprint, and don't
step where you don't need to. Then pray.

> -> r
> Program received signal SIGEMT, Emulation trap.
> 0x5535 in _control87 ()
> -> bt
> #0 0x5535 in _control87 ()
> #1 0x2ec3 in _npxsetup ()
> #2 0x3337 in __crt1_startup ()
> -> c
> Program received signal SIGEMT, Emulation trap.
> 0x554f in _control87 ()
> -> bt
> #0 0x554f in _control87 ()
> #1 0x2ec3 in _npxsetup ()
> #2 0x3337 in __crt1_startup ()

These two are expected: the startup code issues a couple of FP
instructions, for the reasons I explained earlier.

> -> c
> Breakpoint 1, main (argc=1, argv=0x905d4) at analyse_ints.c:129
> 129 if( argc != 2)
> -> bt
> #0 main (argc=1, argv=0x905d4) at analyse_ints.c:129
> #1 0x3368 in __crt1_startup ()
> -> n
> Exiting due to signal SIGFPE

This one is not. What does "disassemble analyse_ints" print near the
EIP of the breakpoint (0x1a64)? Do you see any FP instructions
anywhere around that?

> Coprocessor Error at eip=00001a64, x87 status=
> Program received signal SIGEMT, Emulation trap.
> 0x9611 in _status87 ()
> -> bt
> #0 0x9611 in _status87 ()
> #1 0x47da in do_faulting_finish_message ()
> #2 0x4d13 in __djgpp_traceback_exit ()
> #3 0x4da0 in raise ()
> #4 0x2c3a in nofpsig ()
> #5 0x4daa in raise ()
> #6 0x4e07 in __djgpp_exception_processor ()
> #7 0x1 in ?? ()
> #8 0x3368 in __crt1_startup ()

This is expected: the code which prints the traceback calls
_status87. But what is that 0x1 on the stack?

> -> n
> Single stepping until exit from function _status87,
> which has no line number information.
> Exiting due to signal SIGFPE

I suspect that this happens because GDB is single-stepping the
program. Doing so near DPMI calls is another non-trivial trick,
because you cannot have the TF flag set when issuing an INT xx
instruction.

Or maybe there's another bug. The complexity of what happens there is
really mind-boggling.

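
As a rough sketch of the "nostop noprint" advice given earlier, the
GDB commands might look like the following. This assumes SIGEMT is the
only signal expected from the debuggee (as with the FP emulation traps
shown in this session); other programs may need other signal names.

```
# Assumption: SIGEMT is the signal the debuggee is known to raise.
# Pass it to the program without stopping or announcing it.
handle SIGEMT nostop noprint pass
# Show the resulting signal-handling table to check the setting.
info signals
run
```

With that in place, using "c" to run to a breakpoint past the FP code,
rather than "n" through it, keeps single-stepping away from the
exception machinery.
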
> -> c
> 00c1
> eax=000000c1 ebx=00000010 ecx=00000000 edx=0004fa10 esi=00000000 edi=00010167
> ebp=0008f874 esp=0008f83c program=F:\HACKERY\STAT\NEW_STAT\ANALYSE_
> cs: sel=0167 base=104c0000 limit=0009ffff
> ds: sel=016f base=104c0000 limit=0009ffff
> es: sel=016f base=104c0000 limit=0009ffff
> fs: sel=015f base=0004fa10 limit=00003fff
> gs: sel=017f base=00000000 limit=0010ffff
> ss: sel=016f base=104c0000 limit=0009ffff
> App stack: [000901ac..000101ac] Exceptn stack: [00010120..0000e1e0]
>
> Call frame traceback EIPs:
> 0x00009616 __status87+6

Not bad at all: it eventually got to printing the crash message and
exiting the program ``almost normally''.

You are welcome to work on this, if you feel like it. I'd be happy to
give directions if you do.