Date: Mon, 9 Feb 2004 22:49:19 -0800 (PST)
From: "Peter A. Castro"
To: peda AT sectra DOT se
cc: cygwin AT cygwin DOT com
Subject: Re: Segfault in _cygwin_dll_entry

On Mon, 9 Feb 2004 peda AT sectra DOT se wrote:

> Peter A. Castro wrote:
>
> > In the case of zsh, it's completely cygwin stuff, no MS stuff.
>
> As is the case with LibGGI.

The difference, although it really doesn't matter, is that
libzsh-4.1.1.dll was rebased, while cygggi-2.dll isn't. Something in the
makeup of cygggi-2.dll causes the same condition as when libzsh-4.1.1.dll
is rebased.

> > > > >Is it a known problem?
> > > >
> > > > No. If nothing "obvious" turns up in your initial efforts to scope
> > > > the problem, you're probably going to be best off debugging into
> > > > the Cygwin DLL to see where it crashes.
> > >
> > > One obvious thing to check for is whether the application tries to
> > > dynamically load a Cygwin-dependent DLL (which may result in
> > > attempting to load cygwin1.dll dynamically, and that is *not
> > > supported*).
> >
> > I have yet to fully understand just where the fault is, but I do know
> > this: the .bss segment used by cygwin_dll_entry sometimes is not where
> > it thinks it is.
> >
> > I found this while debugging the zsh rebase problem, and so my methods
> > are a little quirky :)
> >
> > First, rebase the libzsh-4.1.1.dll and start gdb of zsh.exe, then run
> > it. It'll break with a segfault occurring inside _cygwin_dll_entry@12.
> > The specific instruction is at _cygwin_dll_entry@12+146:
> >
> > (gdb) disassemble
> > 0x6ff40951 <_cygwin_dll_entry@12+129>: call   0x6ff41390
> > 0x6ff40956 <_cygwin_dll_entry@12+134>: mov    $0xffffffff,%eax
> > 0x6ff4095b <_cygwin_dll_entry@12+139>: mov    %eax,0x7fd98610
> > 0x6ff40960 <_cygwin_dll_entry@12+144>: jmp    0x6ff408fb <_cygwin_dll_entry@12+43>
> > 0x6ff40962 <_cygwin_dll_entry@12+146>: mov    %ecx,0x7fd985e0
> >                                                    ~~~~~~~~~~
> > 0x6ff40968 <_cygwin_dll_entry@12+152>: mov    $0x1,%eax
> > 0x6ff4096d <_cygwin_dll_entry@12+157>: mov    %eax,0x7fd985f0
> > 0x6ff40972 <_cygwin_dll_entry@12+162>: mov    %edx,0x7fd98600
> > 0x6ff40978 <_cygwin_dll_entry@12+168>: movl   $0x7fd908a0,0x4(%esp,1)
> > 0x6ff40980 <_cygwin_dll_entry@12+176>: mov    %ecx,(%esp,1)
> > 0x6ff40983 <_cygwin_dll_entry@12+179>: call   0x6ff413a0
> >
> > So, what's up with 0x7fd985e0 ? gdb can't seem to resolve it nor access
> > the address (hence the segfault):
> >
> > (gdb) info symbol 0x7fd985e0
> > No symbol matches 0x7fd985e0.
> > (gdb) x/x 0x7fd985e0
> > 0x7fd985e0: Cannot access memory at address 0x7fd985e0
> >
> > Ok, so restore the un-rebased libzsh-4.1.1.dll, start gdb of zsh, set a
> > break point at main and run it. It'll stop at the break point, no
> > faults. Now, get the address of _cygwin_dll_entry@12 and have a look at
> > the same section of code:
> >
> > (gdb) info address _cygwin_dll_entry@12
> > Symbol "_cygwin_dll_entry@12" is at 0x600f08d0 in a file compiled without debugging.
> > (gdb) disassemble
> > 0x600f0951 <_cygwin_dll_entry@12+129>: call   0x600f1390
> > 0x600f0956 <_cygwin_dll_entry@12+134>: mov    $0xffffffff,%eax
> > 0x600f095b <_cygwin_dll_entry@12+139>: mov    %eax,0x600f8610
> > 0x600f0960 <_cygwin_dll_entry@12+144>: jmp    0x600f08fb <_cygwin_dll_entry@12+43>
> > 0x600f0962 <_cygwin_dll_entry@12+146>: mov    %ecx,0x600f85e0
> >                                                    ~~~~~~~~~~
> > 0x600f0968 <_cygwin_dll_entry@12+152>: mov    $0x1,%eax
> > 0x600f096d <_cygwin_dll_entry@12+157>: mov    %eax,0x600f85f0
> > 0x600f0972 <_cygwin_dll_entry@12+162>: mov    %edx,0x600f8600
> > 0x600f0978 <_cygwin_dll_entry@12+168>: movl   $0x600f08a0,0x4(%esp,1)
> > 0x600f0980 <_cygwin_dll_entry@12+176>: mov    %ecx,(%esp,1)
> > 0x600f0983 <_cygwin_dll_entry@12+179>: call   0x600f13a0
> >
> > (gdb) info symbol 0x600f85e0
> > storedHandle in section .bss
> > (gdb) info address storedHandle
> > Symbol "storedHandle" is at 0x600f85e0 in a file compiled without debugging.
> > (gdb) x/x 0x600f85e0
> > 0x600f85e0 : 0x00000000
> >
> > Ah! So, in the un-rebased scenario storedHandle is in a .bss section.
> > So, rebase libzsh-4.1.1.dll again, start gdb of zsh, and let it run.
> > It'll break with a segfault, again, occurring inside
> > _cygwin_dll_entry@12.
> >
> > So, just where is storedHandle?
> >
> > (gdb) info address storedHandle
> > Symbol "storedHandle" is at 0x6ff485e0 in a file compiled without debugging.
> > (gdb) info symbol 0x6ff485e0
> > storedHandle in section .bss
> > (gdb) x/x 0x6ff485e0
> > 0x6ff485e0 : 0x00000000
> >
> > Ah, but the code thinks storedHandle is at 0x7fd985e0 (which isn't
> > addressable)! It turns out that 0x6ff485e0 is the same location this
> > part of the .bss was loaded at in the non-rebased scenario. So, where
> > did things get messed up? Did Windows load the section and pass a bogus
> > section address to the dll or is there a bug in the fixup code, or did
> > cygwin_dll_entry() resolve the handle to the address incorrectly?
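
The three storedHandle addresses quoted above can be compared with a little arithmetic. The sketch below only subtracts the values copied from the gdb sessions; the function name `deltas` and the constant names are made up for illustration. It shows that the address the faulting instruction uses overshoots the real rebased address by exactly the amount the DLL was shifted. That is consistent with the shift having been applied twice to that operand, but it is an observation about the numbers, not a diagnosis of where the extra shift comes from.

```python
# Addresses copied verbatim from the gdb output quoted in this thread.
UNREBASED = 0x600F85E0   # storedHandle with the un-rebased libzsh-4.1.1.dll
REBASED = 0x6FF485E0     # storedHandle after rebasing, per "info address"
REFERENCED = 0x7FD985E0  # operand of the faulting mov at +146


def deltas(unrebased: int, rebased: int, referenced: int) -> tuple[int, int]:
    """Return (shift caused by rebasing, overshoot of the referenced address)."""
    shift = rebased - unrebased
    overshoot = referenced - rebased
    return shift, overshoot


shift, overshoot = deltas(UNREBASED, REBASED, REFERENCED)
print(hex(shift), hex(overshoot), shift == overshoot)
# prints: 0xfe50000 0xfe50000 True
```

If the two differences had not matched, the bad operand would more likely be unrelated garbage; since they do match, the address fixups applied to the DLL entry code seem a reasonable place to look first.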
> >
> > I've looked at the code for cygwin_dll_entry and it's straightforward
> > enough, so I just don't see where things could have gone wrong. Is this
> > perhaps a quirk of the C++ environment or have we perhaps found a
> > Windows bug?
>
> This indeed looks the same. Here's the disassembly from the segfault
> in LibGGI.
>
> 0x00354b21 <_cygwin_dll_entry@12+129>: call   0x354cf0
> 0x00354b26 <_cygwin_dll_entry@12+134>: mov    $0xffffffff,%eax
> 0x00354b2b <_cygwin_dll_entry@12+139>: mov    %eax,0xf06a6060
> 0x00354b30 <_cygwin_dll_entry@12+144>: jmp    0x354acb <_cygwin_dll_entry@12+43>
> 0x00354b32 <_cygwin_dll_entry@12+146>: mov    %ecx,0xf06a6030
>                                                    ~~~~~~~~~~
> 0x00354b38 <_cygwin_dll_entry@12+152>: mov    $0x1,%eax
> 0x00354b3d <_cygwin_dll_entry@12+157>: mov    %eax,0xf06a6040
> 0x00354b42 <_cygwin_dll_entry@12+162>: mov    %edx,0xf06a6050
> 0x00354b48 <_cygwin_dll_entry@12+168>: movl   $0xf06a4a70,0x4(%esp,1)
> 0x00354b50 <_cygwin_dll_entry@12+176>: mov    %ecx,(%esp,1)
> 0x00354b53 <_cygwin_dll_entry@12+179>: call   0x354d00
>
> and here's my version of the storedHandle stuff:
>
> (gdb) info address storedHandle
> Symbol "storedHandle" is at 0x356030 in a file compiled without debugging.
> (gdb) info symbol 0x356030
> storedHandle in section .bss
> (gdb) x/x 0x356030
> 0x356030 : 0x00000000
>
> The four last digits match for me as well, at least an indication...

That's because the same _cygwin_dll_entry contains its own bss, which is
loaded on a segment boundary, so the offset will always be the same (I
think).

> There's no C++ involved in any of the LibGGI dlls, I guess we can
> rule that out. Or is there C++ in the cygwin1.dll?

Yes, cygwin1.dll is written in C++, so there is the element of the C++
runtime to contend with. However, I can't prove that's what's causing the
problem.

> You're talking about rebased dlls. I don't know if cygggi-2.dll is
> rebased or not, how can I tell?
> It is relinked when libtool installs
> it, but so is cyggii-0.dll. And cyggii-0.dll works when used alone.

I kinda doubt your dlls have been rebased, unless you manually ran rebase
or ran the rebaseall script. No, I think that rebasing is helping to
uncover the bug in the case of libzsh, but your dlls evoke the bug in
their unrebased form.

The next challenge is to trace cygwin startup code from the very
beginning, including dll init code...

> And please CC me on this subject, I'm not on the list.
>
> Regards,
> Peter Ekberg

-- 
Peter A. Castro or
	"Cats are just autistic Dogs" -- Dr. Tony Attwood
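
On the question of how to tell whether a DLL has been rebased: one possible check is to read the preferred load address (the ImageBase field) from the DLL's PE header and compare it before and after running rebase, or against another DLL built the same way. The sketch below is an assumption-laden illustration, not part of the Cygwin tools. The function name `pe_image_base` is made up here, and the offsets follow the PE/COFF header layout as I understand it (e_lfanew at 0x3C, a 20-byte COFF header after the "PE\0\0" signature, ImageBase at offset 28 of a PE32 optional header or 24 of a PE32+ one).

```python
import struct


def pe_image_base(data: bytes) -> int:
    """Return the preferred ImageBase stored in a PE image's optional header.

    `data` is the raw content of the .dll or .exe file (the first few
    hundred bytes are normally enough).
    """
    if data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # e_lfanew: file offset of the "PE\0\0" signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    # The optional header follows the 4-byte signature and 20-byte COFF header.
    opt = pe_offset + 4 + 20
    (magic,) = struct.unpack_from("<H", data, opt)
    if magic == 0x10B:      # PE32: 4-byte ImageBase at offset 28
        (base,) = struct.unpack_from("<I", data, opt + 28)
    elif magic == 0x20B:    # PE32+: 8-byte ImageBase at offset 24
        (base,) = struct.unpack_from("<Q", data, opt + 24)
    else:
        raise ValueError(f"unexpected optional header magic {magic:#x}")
    return base
```

It could be used as `hex(pe_image_base(open("cygggi-2.dll", "rb").read()))`, assuming the DLL is in the current directory. A value that differs from the one the linker originally assigned suggests the file was rebased on disk; the value alone does not say which tool changed it.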