Mail Archives: cygwin/2015/07/06/09:16:04
Hi Corinna,
On 7/6/2015 6:01 AM, Corinna Vinschen wrote:
> Hi Ken,
>
>
> thanks for further testing this.
>
>
> On Jul 5 22:15, Ken Brown wrote:
>> On 7/5/2015 5:34 PM, Corinna Vinschen wrote:
>>> This test release needs some good testing!
>>
>> I repeated the emacs experiment discussed in the "[ANNOUNCEMENT] TEST
>> RELEASE: Cygwin 2.1.0-0.1" thread. In the 32-bit case, the results were
>> more-or-less the same as before: I forced a stack overflow, emacs recovered,
>> I tried to continue working, there was a second SIGSEGV, and handle_sigsegv
>> bailed out because garbage collection was in progress. This time I was
>> unable to prevent the second SIGSEGV by resetting max-specpdl-size and
>> max-lisp-eval-depth. I'm not sure what caused the second SIGSEGV, but it
>> might have nothing to do with Cygwin.
>>
>> In the 64-bit case, however, the recovery from stack overflow never happened
>> (i.e., the program never reached the siglongjmp). Here's a gdb session:
>> [...]
>> 1647 if (!getrlimit (RLIMIT_STACK, &rlim))
>> (gdb)
>> 1656 beg = stack_bottom;
>> (gdb)
>> 1657 end = stack_bottom + stack_direction * rlim.rlim_cur;
>> (gdb)
>> 1658 if (beg > end)
>> (gdb)
>> 1660 addr = (char *) siginfo->si_addr;
>> (gdb)
>> 1663 if (beg < addr && addr < end
>> (gdb) p beg
>> $1 = 0x82ca27 ""
>> (gdb) p addr
>> $2 = 0x33ff8 ""
>
> I can't reproduce this. It works fine for me. For reference I attached
> my simplified testcase again. It's basically the emacs SIGSEGV setup,
> main triggers the stack overflow, the handler tries to write a file for
> testing if that works from the handler, then it siglongjmps. The main
> function tests if it can still fork, and then it repeats the action to
> test if we're back to normal in terms of signal handling.
>
> If it works (and it does for me) the output looks like this:
>
> $ ./sigalt
> command loop 1 before crash
> command loop 1 after crash
> In child
> In parent
> command loop 2 before crash
> command loop 2 after crash
> In child
> In parent
>
> On W8.1 for a standard GCC build of this testcase I get:
>
> (gdb) p beg
> $1 = 0x40ac3 <error: Cannot access memory at address 0x40ac3>
> (gdb) p addr
> $2 = 0x43848 <error: Cannot access memory at address 0x43848>
> (gdb) p end
> $3 = 0x23cac3 ""
> (gdb) p/x rlim.rlim_cur
> $5 = 0x1fc000
>
> Check default stacksize:
>
> $ peflags -x ./sigalt
> ./sigalt: stack reserve size : 2097152 (0x200000) bytes
>
> 0x200000 - dead zone 4K - default W8.1 64 bit guardpagesize 3 * 4K ==
> 0x1fc000, the value rlim.rlim_cur returns. Looks good to me.
>
> On W8.1 32 bit under WOW:
>
> (gdb) p beg
> $1 = 0x8fc33 ""
> (gdb) p addr
> $2 = 0x92d5c <error: Cannot access memory at address 0x92d5c>
> (gdb) p end
> $3 = 0x28cc33 ""
> (gdb) p/x rlim.rlim_cur
> $4 = 0x1fd000
>
> $ peflags -x ./sigalt
> ./sigalt: stack reserve size : 2097152 (0x200000) bytes
>
> 0x200000 - dead zone 4K - default W8.1 32 bit guardpagesize 2 * 4K ==
> 0x1fd000.
>
> On W7 32 bit native:
>
> (gdb) p beg
> $1 = 0x2ec43 "\376\356..."
> (gdb) p addr
> $2 = 0x32d6c ""
> (gdb) p end
> $3 = 0x22cc43 ""
> (gdb) p rlim.rlim_cur
> $4 = 2088960
> (gdb) p/x rlim.rlim_cur
> $5 = 0x1fe000
>
> $ peflags -x ./sigalt
> ./sigalt: stack reserve size : 2097152 (0x200000) bytes
>
> 0x200000 - dead zone 4K - default W7 32 bit guardpagesize 1 * 4K ==
> 0x1fe000.
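
The arithmetic in the three cases above can be written down as a small C sketch. It only restates the formula from the quoted text (reserve size minus a 4K dead zone minus the guard pages). The helper name expected_rlim_cur and the macro PAGE_4K are made up for illustration and are not Cygwin code.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_4K UINT64_C(0x1000)   /* 4K page, as in the figures above */

/* Hypothetical helper: the value getrlimit (RLIMIT_STACK) is expected
   to report, assuming it is the stack reserve size minus one 4K dead
   zone page and minus guard_pages guard pages of 4K each.  */
static uint64_t
expected_rlim_cur (uint64_t reserve, unsigned guard_pages)
{
  uint64_t overhead = PAGE_4K * (1 + (uint64_t) guard_pages);

  assert (reserve > overhead);
  return reserve - overhead;
}
```

With a reserve of 0x200000 this gives 0x1fc000 for 3 guard pages, 0x1fd000 for 2, and 0x1fe000 for 1, matching the three gdb sessions.
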
>
>> Note that addr < beg, so we never reach the siglongjmp.
>
> I have no explanation for this. What OS? What does rlim_cur contain?
> What does peflags -x print for this executable?
I'm on W7 64-bit. The problem seems to be that rlim_cur is too big.

$ peflags -x ./emacs
./emacs: stack reserve size : 8388608 (0x800000) bytes

(gdb) p beg
$3 = 0x82ca27 ""
(gdb) p/x rlim.rlim_cur
$2 = 0x850e80

So there's overflow when end is computed:

(gdb) p end
$4 = 0xfffffffffffdbba7 <error: Cannot access memory at address 0xfffffffffffdbba7>

This doesn't happen when I run your testcase with the same 8MB stack size:

$ peflags -x0x800000 ./sigalt.exe
./sigalt.exe: stack reserve size : 8388608 (0x800000) bytes

(gdb) p beg
$1 = 0x82cabb ""
(gdb) p/x rlim.rlim_cur
$2 = 0x7fd000
(gdb) p end
$3 = 0x2fabb
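
To make the wraparound concrete, here is a minimal C sketch of the range computation stepped through in the gdb session. The helper names stack_end and fault_in_stack_range are hypothetical and not taken from the emacs source. The sketch assumes 64-bit unsigned arithmetic and a stack_direction of -1 (stack growing downward).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: end = stack_bottom + stack_direction * rlim_cur,
   done in unsigned 64-bit arithmetic so the result wraps modulo 2^64
   when rlim_cur is larger than stack_bottom.  */
static uint64_t
stack_end (uint64_t stack_bottom, int stack_direction, uint64_t rlim_cur)
{
  assert (stack_direction == 1 || stack_direction == -1);
  return stack_direction < 0 ? stack_bottom - rlim_cur
                             : stack_bottom + rlim_cur;
}

/* Hypothetical helper mirroring the check seen in gdb: order beg and
   end, then test whether the faulting address lies strictly between.  */
static bool
fault_in_stack_range (uint64_t stack_bottom, int stack_direction,
                      uint64_t rlim_cur, uint64_t addr)
{
  uint64_t beg = stack_bottom;
  uint64_t end = stack_end (stack_bottom, stack_direction, rlim_cur);

  if (beg > end)
    {
      uint64_t tmp = beg;
      beg = end;
      end = tmp;
    }
  return beg < addr && addr < end;
}
```

With the emacs values (stack_bottom 0x82ca27, rlim_cur 0x850e80) end wraps to 0xfffffffffffdbba7, beg and end are not swapped, and addr 0x33ff8 is below beg, so the check fails. With the sigalt values (stack_bottom 0x82cabb, rlim_cur 0x7fd000) end is 0x2fabb, the two are swapped, and the same address (reused here only for illustration) falls inside the range. Note that 0x850e80 is larger than the 0x800000 reserve size itself.
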
> And last but not least, what is emacs doing there? The stack should be
> in pretty good shape when it's back to the main loop. The stack
> is fully committed and has the default number of guard pages at the bottom,
> as it is just short of the stack overflow.
>
> For debugging purposes I also added a global variable called "tib" and a
> memory info struct called "m" to the testcase which are initialized
> right at the start of main. tib points to the start of the TEB (Thread
> Environment Block, a Windows per-thread bookkeeping structure) of the
> main thread. If you expand it right after it's fetched, you get
> something along these lines:
>
> (gdb) p *tib
> $2 = {ExceptionList = 0x22cd78, StackBase = 0x230000, StackLimit = 0x20c000,
> SubSystemTib = 0x0, {FiberData = 0x1e00, Version = 7680},
> ArbitraryUserPointer = 0x0, Self = 0x7ffdf000}
>
> Note the values of StackBase and StackLimit and compare with your beg and
> end values. StackBase is the upper limit of the stack. It grows downward
> from there. StackLimit is the lowest address committed so far. It's not much
> yet as you can see, 0x230000-0x20c000 == 0x24000 == 144K. Since Cygwin
> executables have a default stack of 2 Megs, the allocation base of the stack
> is probably at 0x30000. This can be checked by looking at m:
>
> (gdb) p m
> $1 = {BaseAddress = 0x22c000, AllocationBase = 0x30000, AllocationProtect = 4,
> RegionSize = 16384, State = 4096, Protect = 4, Type = 131072}
>
> See the value of AllocationBase.
>
> When you hit the breakpoint in handle_sigsegv, the output of tib should
> look like this:
>
> (gdb) p *tib
> $2 = {ExceptionList = 0x22cd78, StackBase = 0x230000, StackLimit = 0x32000,
> SubSystemTib = 0x0, {FiberData = 0x1e00, Version = 7680},
> ArbitraryUserPointer = 0x0, Self = 0x7ffdf000}
>
> Observe the value of StackLimit. For this output I ran the testcase on
> W7 32 bit. It has a default guard page of 4K. The new wrapper I wrote
> in Cygwin restored the stack to its state right before the stack overflow
> occurred:
>
> - At 0x30000 we have the 4K dead zone, which is always only reserved,
> never committed.
>
> - At 0x31000 the 4K guard page starts.
>
> - Thus the StackLimit (the start of the committed region of the stack)
> starts at 0x32000.
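
The layout in the list above can be sketched as plain address arithmetic in C. The struct and function names are hypothetical, chosen only for illustration. The sketch assumes 4K pages, one dead zone page at the allocation base, and the guard pages directly above it, as described in the quoted text.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_4K UINT64_C(0x1000)

/* Hypothetical description of the stack just short of an overflow.  */
struct stack_layout
{
  uint64_t dead_zone;    /* reserved only, never committed */
  uint64_t guard_start;  /* first guard page */
  uint64_t stack_limit;  /* lowest committed address */
  uint64_t stack_base;   /* upper end, the stack grows down from here */
};

static struct stack_layout
layout_at_overflow (uint64_t allocation_base, uint64_t reserve,
                    unsigned guard_pages)
{
  struct stack_layout l;

  assert (reserve > PAGE_4K * (1 + (uint64_t) guard_pages));
  l.dead_zone = allocation_base;
  l.guard_start = allocation_base + PAGE_4K;
  l.stack_limit = l.guard_start + PAGE_4K * guard_pages;
  l.stack_base = allocation_base + reserve;
  return l;
}
```

For AllocationBase 0x30000, a 0x200000 reserve and one guard page this yields a guard page at 0x31000, StackLimit 0x32000 and StackBase 0x230000, the values shown in the tib output above.
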
>
> You can utilize tib and m for testing in emacs as well. Just do this:
>
> #include <windows.h>
>
> NT_TIB *tib;
> MEMORY_BASIC_INFORMATION m;
>
> [...]
>
> in main:
>
> /* Record (approximately) where the stack begins. */
> stack_bottom = &stack_bottom_variable;
> tib = (NT_TIB *) __readfsdword(PcTeb);
> VirtualQuery (stack_bottom, &m, sizeof m);
I'll try this next and report back.
Ken
--
Problem reports: http://cygwin.com/problems.html
FAQ: http://cygwin.com/faq/
Documentation: http://cygwin.com/docs.html
Unsubscribe info: http://cygwin.com/ml/#unsubscribe-simple