Mailing-List: contact cygwin-developers-help AT sourceware DOT cygnus DOT com; run by ezmlm
Sender: cygwin-developers-owner AT sources DOT redhat DOT com
Delivered-To: mailing list cygwin-developers AT sources DOT redhat DOT com
Date: Thu, 6 Sep 2001 21:24:50 +0200
From: Corinna Vinschen
To: cygwin-developers AT cygwin DOT com
Subject: Re: Figured out how to reproduce vfork/rsync bug!
Message-ID: <20010906212450.B13680@cygbert.vinschen.de>
Reply-To: cygdev
Mail-Followup-To: cygwin-developers AT cygwin DOT com
References: <20010906142836 DOT 7323 DOT qmail AT lizard DOT curl DOT com>
    <20010906164756 DOT 19885 DOT qmail AT lizard DOT curl DOT com>
    <20010906203947 DOT Q537 AT cygbert DOT vinschen DOT de>
    <20010906184747 DOT 20476 DOT qmail AT lizard DOT curl DOT com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5i
In-Reply-To: <20010906184747.20476.qmail@lizard.curl.com>;
    from jik@curl.com on Thu, Sep 06, 2001 at 02:47:47PM -0400

On Thu, Sep 06, 2001 at 02:47:47PM -0400, Jonathan Kamens wrote:
> >  Date: Thu, 6 Sep 2001 20:39:47 +0200
> >  From: Corinna Vinschen
> >
> >  Nothing special. It just said BOOOOM!
>
> Ah.  Did I forget to mention that sometimes this bug kills *all* your
> cygwin processes, not just the processes that fail.  Sorry about that
> :-).

You dare! ;-)

> Incidentally, I've managed to isolate it to something that was checked
> in between July 16 and July 17 (i.e., between "cvs update -D7/16/2001"
> and "cvs update -D7/17/2001"; I'm not sure whether that uses midnight
> local time, time on the CVS server or GMT).  I'll let you know as I

It uses your local time while the timestamps in cygwin-cvs are always
PST (PDT, currently).  I for one have to add 9 hours to get localtime.

> find out more.
>
> >  I still have to examine the stackdump...
>
> You mean it's actually possible to derive useful information from
> those stackdumps?
> I've not been able to find anybody here who knows
> how to do that.  Do tell.

Ok, here we go.  The stackdump I got was created by the application
which raised the exception, here `rsync', so I got a new file
`rsync.exe.stackdump'.  Let's look into its contents:

$ cat rsync.exe.stackdump
Exception: STATUS_ACCESS_VIOLATION at eip=61024931
eax=FFFFFFFF ebx=614F020C ecx=0242FF08 edx=02436720 esi=610AD578 edi=610248D4
ebp=02420000 esp=0242F9EC program=C:\cygwin\bin\rsync.exe
cs=001B ds=0023 es=0023 fs=003B gs=0000 ss=0023
Stack trace:
Frame     Function  Args
  56182 [main] rsync 2384 handle_exceptions: Error while dumping state (probably corrupted stack)

Hmm, that's not that good.  The stack is corrupted so the backtrace
didn't work.  Ok, let's try with another stackdump from another crash:

Exception: STATUS_ACCESS_VIOLATION at eip=6109DD7D
eax=00000001 ebx=0A017908 ecx=0000C008 edx=0000C009 esi=0A023908 edi=0A017900
ebp=0022F644 esp=0022F61C program=C:\cygwin\bin\sh.exe
cs=001B ds=0023 es=0023 fs=003B gs=0000 ss=0023
Stack trace:
Frame     Function  Args
0022F644  6109DD7D  (610AB020, 0A017908, 614C3CC8, 6102CAD3)
0022F674  61036632  (0A017908, 00000124, 0022FF18, 77E9DCBE)
0022F6A4  610364B6  (0A017908, 00000000, 0022F6F4, 6102E54A)
0022F6C4  61065705  (00412984, 004129D4, 00230178, 00230178)
0022F6F4  6102E552  (0022F864, 0022F868, 0022F86C, 00000103)
0022F874  6102F5BB  (00412984, 0022F8CC, 00000003, 61074E73)
0022F894  00406E84  (0A0178A0, 004129D4, 00000000, FFFFFFFF)
0022F8D4  004022B3  (00412970, 00000000, 0022F904, 61093102)
0022F904  00401D7D  (00412970, 00000001, 0022FAC4, 0040B905)
0022F954  0040254A  (00412970, 0022FA44, 0022FA54, 0040489B)
End of stack trace (more stack frames may be present)

What you see is the call stack of the crashing application at the
moment the crash happened.  The first few lines print the contents
of the CPU registers, the rest is the list of functions currently
on the stack.
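If you'd rather not pick the numbers out by eye, the two parts of the dump (register lines, then Frame/Function/Args rows) can be split mechanically.  What follows is only a rough Python sketch, not anything that ships with Cygwin: the name `parse_stackdump' and both regular expressions are made up for this illustration, and they assume the 32-bit layout shown in the dumps above.

```python
import re

# Hypothetical patterns, written against the dump layout quoted in this mail.
# Registers look like "eip=6109DD7D" or "cs=001B".
REGISTER_RE = re.compile(r"\b([a-z]{2,3})=([0-9A-Fa-f]{4,8})\b")
# Stack rows look like "0022F644  6109DD7D  (610AB020, 0A017908, ...)".
FRAME_RE = re.compile(r"\b([0-9A-Fa-f]{8})\s+([0-9A-Fa-f]{8})\s+\(([^)]*)\)")


def parse_stackdump(text):
    """Split a stackdump into a register dict and a list of frames."""
    registers = {name: int(value, 16) for name, value in REGISTER_RE.findall(text)}
    frames = []
    for frame, function, args in FRAME_RE.findall(text):
        frames.append({
            "frame": int(frame, 16),        # position on the stack
            "function": int(function, 16),  # address inside the function
            "args": [int(a, 16) for a in args.split(",") if a.strip()],
        })
    # A corrupted stack simply yields an empty frame list.
    return {"registers": registers, "frames": frames}


# First two rows of the sh.exe dump from above, as sample input.
SAMPLE = r"""Exception: STATUS_ACCESS_VIOLATION at eip=6109DD7D
eax=00000001 ebx=0A017908 ecx=0000C008 edx=0000C009 esi=0A023908 edi=0A017900
ebp=0022F644 esp=0022F61C program=C:\cygwin\bin\sh.exe
cs=001B ds=0023 es=0023 fs=003B gs=0000 ss=0023
Stack trace:
Frame     Function  Args
0022F644  6109DD7D  (610AB020, 0A017908, 614C3CC8, 6102CAD3)
0022F674  61036632  (0A017908, 00000124, 0022FF18, 77E9DCBE)
"""

dump = parse_stackdump(SAMPLE)
print(hex(dump["registers"]["eip"]), len(dump["frames"]))
```

For the corrupted rsync dump the same function should still return the registers, just with no frames, which is exactly the situation where only `eip' is left to work with.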
The uppermost function is the latest which has been called (and is
actually the one in which the crash has happened), the next function
is the function which called the first function and so forth.

`Frame' means the position on the stack at which the local vars are
stored, `Function' is the address in the function at which the next
function has been called.  For the uppermost function it's the crash
address.  `Args' are just the first 16 bytes of arguments to the
function.

To examine the addresses you can simply start gdb:

$ gdb -nw /bin/sh
GNU gdb 5.0 (20010428-1)
Copyright 2001 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i686-pc-cygwin"...
(gdb) dll cygwin1.dll
(gdb) disas 0x6109DD7D
Dump of assembler code for function _free_r:
0x6109dd0c <_free_r>:   Cannot access memory at address 0x6109dd0c

Ok, now we at least know that this example crashed in free_r()...

(gdb) disas 0x61036632
Dump of assembler code for function export_free:
0x610365e0 <export_free>:       Cannot access memory at address 0x610365e0

...which has been called from export_free() etc.

For the above `rsync' crash we have only minimal information in the
stackdump, unfortunately.  However, did you see that the content of
the `eip' register is identical to the function address of the
uppermost (crashing) function?  So we can at least get the information
in which function `rsync' crashed by asking gdb for the function of
the address in `eip':

(gdb) disas 0x61024931
Dump of assembler code for function fixup_after_fork__15fhandler_socketPv:
0x61024914 <fixup_after_fork__15fhandler_socketPv>:     Cannot access memory at address 0x61024914

Ok, so it's fhandler_socket::fixup_after_fork().

This should just give an idea.  Examining the stackdump is not the
same as live debugging with gdb.
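The lookup procedure above (every `Function' address, or `eip' alone when the backtrace is missing) can be written down as a tiny helper that just produces the gdb commands to type.  Again a sketch only: `gdb_lookup_commands' is a made-up name, and the commands it emits are the same `dll' and `disas' lines used in the session above, nothing more.

```python
def gdb_lookup_commands(eip, frame_functions):
    """Return the gdb commands that resolve the crash-relevant addresses.

    eip             -- value of the eip register from the stackdump
    frame_functions -- `Function' column values, innermost frame first;
                       empty when the stack was corrupted
    """
    # With a usable backtrace, look up every frame.  Without one,
    # eip is the only address we have, so fall back to it.
    addresses = list(frame_functions) if frame_functions else [eip]
    commands = ["dll cygwin1.dll"]  # load the DLL first, as in the session above
    commands += [f"disas 0x{address:08X}" for address in addresses]
    return commands


# The corrupted rsync dump: no frames, so only eip gets looked up.
for line in gdb_lookup_commands(0x61024931, []):
    print(line)
```

The resulting lines are meant to be typed (or pasted) at the (gdb) prompt; the function names still have to be read off gdb's answers by hand.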
You'll get way more information _if_ the problem persists under
debugger control... which isn't a matter of course :-(

Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Developer                  mailto:cygwin AT cygwin DOT com
Red Hat, Inc.