Mail Archives: cygwin/2013/08/14/10:05:19
Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm
List-Id: <cygwin.cygwin.com>
List-Subscribe: <mailto:cygwin-subscribe AT cygwin DOT com>
List-Archive: <http://sourceware.org/ml/cygwin/>
List-Post: <mailto:cygwin AT cygwin DOT com>
List-Help: <mailto:cygwin-help AT cygwin DOT com>, <http://sourceware.org/ml/#faqs>
Sender: cygwin-owner AT cygwin DOT com
Mail-Followup-To: cygwin AT cygwin DOT com
Delivered-To: mailing list cygwin AT cygwin DOT com
Message-ID: <520B8E7F.6060709@cs.utoronto.ca>
Date: Wed, 14 Aug 2013 10:04:47 -0400
From: Ryan Johnson <ryan DOT johnson AT cs DOT utoronto DOT ca>
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:17.0) Gecko/20130801 Thunderbird/17.0.8
MIME-Version: 1.0
To: cygwin AT cygwin DOT com
Subject: Re: 64-bit emacs crashes a lot
References: <51F3151D DOT 7040000 AT cs DOT utoronto DOT ca> <51F33565 DOT 1090406 AT cornell DOT edu> <51F33F52 DOT 4060405 AT cs DOT utoronto DOT ca> <51FB1D9E DOT 5090102 AT cs DOT utoronto DOT ca> <20130802080211 DOT GA18054 AT calimero DOT vinschen DOT de> <51FB9228 DOT 2020309 AT cornell DOT edu> <51FBA100 DOT 90005 AT cs DOT utoronto DOT ca> <51FD5462 DOT 5020400 AT cs DOT utoronto DOT ca> <51FFBDFF DOT 7040501 AT cornell DOT edu> <51FFC4F2 DOT 8080909 AT cs DOT utoronto DOT ca> <5203D89E DOT 6030801 AT cornell DOT edu> <5203DCCA DOT 1010105 AT cs DOT utoronto DOT ca> <5205B364 DOT 8090007 AT cs DOT utoronto DOT ca> <52064730 DOT 50404 AT cornell DOT edu> <52065B3C DOT 6060104 AT cs DOT utoronto DOT ca> <52067FDD DOT 4000708 AT cornell DOT edu>
In-Reply-To: <52067FDD.4000708@cornell.edu>

On 10/08/2013 2:01 PM, Ken Brown wrote:
> On 8/10/2013 11:24 AM, Ryan Johnson wrote:
>> On 10/08/2013 9:59 AM, Ken Brown wrote:
>>> On 8/9/2013 11:28 PM, Ryan Johnson wrote:
>>>> On 08/08/2013 2:00 PM, Ryan Johnson wrote:
>>>>> On 08/08/2013 1:42 PM, Ken Brown wrote:
>>>>>> On 8/5/2013 11:29 AM, Ryan Johnson wrote:
>>>>>>> On 05/08/2013 11:00 AM, Ken Brown wrote:
>>>>>>>> On 8/3/2013 3:05 PM, Ryan Johnson wrote:
>>>>>>>>> On 02/08/2013 8:07 AM, Ryan Johnson wrote:
>>>>>>>>>> On 02/08/2013 7:04 AM, Ken Brown wrote:
>>>>>>>>>>> On 8/2/2013 4:02 AM, Corinna Vinschen wrote:
>>>>>>>>>>>> On Aug 1 22:46, Ryan Johnson wrote:
>>>>>>>>>>>>> Here's a new one... I started a compilation, but before it
>>>>>>>>>>>>> actually
>>>>>>>>>>>>> invoked the command it started pegging the CPU. After
>>>>>>>>>>>>> ^G^G^G, it
>>>>>>>>>>>>> crashed with the following:
>>>>>>>>>>>>>> Auto-save? (y or n) y
>>>>>>>>>>>>>> 0 [main] emacs 5076 C:\cygwin64\bin\emacs-nox.exe: ***
>>>>>>>>>>>>>> fatal
>>>>>>>>>>>>>> error - Internal error: TP_NUM_W_BUFS too small 2268032
>>>>>>>>>>>>>> >= 10.
>>>>>>>>>>>>
>>>>>>>>>>>> That looks like a memory overwrite. 2268032 is 0x229b80,
>>>>>>>>>>>> which
>>>>>>>>>>>> looks
>>>>>>>>>>>> suspiciously like a stack address. And the overwritten
>>>>>>>>>>>> value is
>>>>>>>>>>>> on the
>>>>>>>>>>>> stack, too, well within the cygwin TLS area. If *this* value
>>>>>>>>>>>> gets
>>>>>>>>>>>> overwritten, the TLS is probably totally hosed at this point.
>>>>>>>>>>>> There's
>>>>>>>>>>>> just no way to infer the culprit from this limited info.
>>>>>>>>>>>
>>>>>>>>>>> Could this be BLODA? Ryan, I noticed that you wrote in a
>>>>>>>>>>> different
>>>>>>>>>>> thread, "I recently migrated to 64-bit cygwin...and so far
>>>>>>>>>>> have not
>>>>>>>>>>> had to disable Windows Defender; the latter was a recurring
>>>>>>>>>>> source of
>>>>>>>>>>> trouble for my previous 32-bit cygwin install on Win7/64."
>>>>>>>>>> This would be a whole new level of nasty from a BLODA... I
>>>>>>>>>> thought
>>>>>>>>>> they only interfered with fork()?
>>>>>>>>>>
>>>>>>>>>> However, this *is* Windows Defender we're talking about...
>>>>>>>>>> service
>>>>>>>>>> disabled and all cygwin processes restarted. I'll let you know
>>>>>>>>>> in a
>>>>>>>>>> day or so if the crashes go away.
>>>>>>>>> Rats. I just had another crash, the "Fatal error 6" variety.
>>>>>>>>> Windows
>>>>>>>>> Defender has not turned itself back on (it's been known to do
>>>>>>>>> that), and
>>>>>>>>> a scan of the BLODA list didn't match anything else on my system.
>>>>>>>>>
>>>>>>>>> So I don't think it's BLODA...
>>>>>>>>>
>>>>>>>>> Ideas?
>>>>>>>>
>>>>>>>> Not really, other than the obvious: (a) Find a reproducible way of
>>>>>>>> making emacs-nox crash. (b) Catch the crash in gdb by setting a
>>>>>>>> suitable break point.
>>>> Got one! Looks like a stack overflow somewhere in the garbage
>>>> collector:
>>>>
>>>> Program received signal SIGSEGV, Segmentation fault.
>>>> [Switching to Thread 5316.0x1af4]
>>>> 0x00000001004df44a in mark_object (arg=<optimized out>)
>>>> at /usr/src/debug/emacs-24.3-4/src/alloc.c:5903
>>>> 5903 if (CONS_MARKED_P (ptr))
>>>> (gdb) bt
>>>> #0 0x00000001004df44a in mark_object (arg=<optimized out>)
>>>> at /usr/src/debug/emacs-24.3-4/src/alloc.c:5903
>>>> #1 0x00000001004df66e in mark_object (arg=<optimized out>)
>>>> at /usr/src/debug/emacs-24.3-4/src/alloc.c:5914
>>>> #2 0x00000001004df593 in mark_object (arg=<optimized out>)
>>>> at /usr/src/debug/emacs-24.3-4/src/alloc.c:5809
>>>> #3 0x00000001004df66e in mark_object (arg=<optimized out>)
>>>> at /usr/src/debug/emacs-24.3-4/src/alloc.c:5914
>>>> #4 0x00000001004df66e in mark_object (arg=<optimized out>)
>>>> at /usr/src/debug/emacs-24.3-4/src/alloc.c:5914
>>>> #5 0x00000001004df585 in mark_object (arg=<optimized out>)
>>>> at /usr/src/debug/emacs-24.3-4/src/alloc.c:5808
>>>> #6 0x00000001004dfa4e in mark_vectorlike (
>>>> ptr=0x100f66f28 <bss_sbrk_buffer+6955080>)
>>>> at /usr/src/debug/emacs-24.3-4/src/alloc.c:5501
>>>> ... snip ...
>>>> #2606 0x00000001004dfaf4 in mark_buffer (buffer=<optimized out>)
>>>> at /usr/src/debug/emacs-24.3-4/src/alloc.c:5552
>>>> #2607 0x00000001004dff2c in Fgarbage_collect ()
>>>> at /usr/src/debug/emacs-24.3-4/src/alloc.c:5181
>>>> #2608 0x0000000000000000 in ?? ()
>>>
>>> I don't know whether 2608 stack frames is unusual or not. Is this
>>> enough to cause a stack overflow?
>> I don't know the answer to that for emacs, but in general that's an
>> exceedingly deep stack that would normally indicate some sort of
>> infinite recursion. Would you actually expect an object tree in emacs to
>> be 2000+ pointers deep? No plausible non-bug scenarios leap to mind
>> right off...
>
> I'd be very surprised if there were a bug in the garbage collection
> routine that's causing this. If there were, I'd expect to see lots of
> people reporting this. Could there be some memory corruption that
> creeps in when you suspend/resume emacs? You did say that the crashes
> are less frequent since you deactivated Windows Defender, so I'm not
> sure you can rule out BLODA.
>
> By the way, are your crashes always related to suspending and resuming
> emacs? I don't recall that you said that before, but you keep
> mentioning ^Z. Do you still get crashes if you never suspend emacs?
> You could also try one of the GUI versions of emacs to see if you get
> crashes. "Suspending" in that case simply iconifies the frame.
>
>>>
>>>> I have the full backtrace saved to file, let me know if that would be
>>>> useful (there wasn't anything obvious that I could see, just more
>>>> of the
>>>> same). Meanwhile, I verified that none of the addresses printed is
>>>> repeated, so it doesn't seem to be due to an obvious cycle in the
>>>> object
>>>> graph.
>>>
>>> From what you've shown, it appears that most of the addresses have
>>> been optimized out. I think you would need an unoptimized build in
>>> order to check that, wouldn't you?
>> Probably, yes. That's why I said no "obvious" cycles -- at least the 400
>> pointers that are shown don't show a problem.
>>
>>>
>>>> The crash happened when I foregrounded a stopped emacs. I tried
>>>> playing
>>>> around with various breakpoints while repeatedly sending ^Z, but no
>>>> luck
>>>> repeating the "feat" yet.
>>>>
>>>> Ideas?
>>>
>>> Can you trigger the bug by calling garbage collection manually (M-x
>>> garbage-collect)? What happens if you put a breakpoint at
>>> Fgarbage_collect and step through it? (Again, you might need an
>>> unoptimized build before that will be useful.)
>> I tried breaking on Fgarbage_collect and hitting ^Z, but no love. I also
>> tried setting a breakpoint on one of those other internal functions,
>> with an ignore count intended to trigger it deep in a GC cycle. It
>> triggered some tens of frames deep and ^Z there didn't cause trouble
>> either. I wonder if the GC cycle just happened to coincide with
>> reactivating emacs (perhaps triggered by some internal timeout that
>> elapsed while it was stopped?)
>>
>>>
>>> There are lots of lisp variables that can be used to control garbage
>>> collection and get information about it. See the section on garbage
>>> collection in the elisp manual. For example, you could try
>>> customizing garbage-collection-messages. Or you could play with
>>> gc-cons-threshold.
>> I didn't see anything glaringly useful there... the messages just
>> announce a GC run, which gdb can catch just fine. There doesn't seem to
>> be any way of tracking how deep an object tree emacs traversed, or how
>> many objects were freed.
>
> Sorry, I misread what the message would be. I should have said that
> you could look directly at the output from garbage-collect, which you
> can see if you evaluate (garbage-collect) in the *scratch* buffer.
> But, as I said above, I'm not sure that garbage collection is the
> underlying problem here.

Agree it's probably not GC... GC would just tend to trip over any bad
pointers that were lurking around...

After a rash of crashes where I either forgot to attach gdb or forgot to
set appropriate breakpoints, I finally managed to catch the stack trace
below. It occurred during M-x compile, while emacs parsed the
compilation's rather copious output, which is by far the most common
type of crash I've been getting lately. I have no idea how to interpret
the backtrace, though.

What should I try next? I assume I'll need a debug-compiled emacs so the
backtrace isn't garbage? If so, (a) what is the most straightforward way
to compile emacs-nox that way and (b) what would I be looking for if I
encountered the below stack trace in a debug build?

Thanks,
Ryan

Breakpoint 2, 0x000000010055d190 in kill ()
(gdb) bt
#0 0x000000010055d190 in kill ()
#1 0x000000010053702e in process_send_signal (process=process@entry=25781889629, signo=signo@entry=2, current_group=<optimized out>, nomsg=nomsg@entry=0)
at /usr/src/debug/emacs-24.3-4/src/process.c:5948
#2 0x0000000100537198 in Finterrupt_process (process=25781889629, current_group=<optimized out>)
at /usr/src/debug/emacs-24.3-4/src/process.c:5966
#3 0x00000001004f7761 in Ffuncall (nargs=<optimized out>, args=<optimized out>)
at /usr/src/debug/emacs-24.3-4/src/eval.c:2781
#4 0x000000010052b5ed in exec_byte_code (bytestr=4294962344, vector=2268896, maxdepth=2, args_template=4303595040, nargs=4304157760, args=0x100902032 <bss_sbrk_buffer+250194>)
at /usr/src/debug/emacs-24.3-4/src/bytecode.c:900
#5 0x00000001004f7293 in funcall_lambda (fun=25778101277, nargs=nargs@entry=0, arg_vector=arg_vector@entry=0x22a188)
at /usr/src/debug/emacs-24.3-4/src/eval.c:3010
#6 0x00000001004f75cb in Ffuncall (nargs=nargs@entry=1, args=args@entry=0x22a180)
at /usr/src/debug/emacs-24.3-4/src/eval.c:2839
#7 0x00000001004f8bef in apply1 (fn=25778613730, fn@entry=4304161216, arg=arg@entry=4304412722)
at /usr/src/debug/emacs-24.3-4/src/eval.c:2539
#8 0x00000001004f3567 in Fcall_interactively (function=4304161216, record_flag=4304412722, keys=4299711881)
at /usr/src/debug/emacs-24.3-4/src/callint.c:377
#9 0x00000001004f7752 in Ffuncall (nargs=nargs@entry=4, args=args@entry=0x22a3b0)
at /usr/src/debug/emacs-24.3-4/src/eval.c:2785
#10 0x00000001004f91b7 in call3 (fn=<optimized out>, arg1=<optimized out>, arg2=<optimized out>, arg3=<optimized out>)
at /usr/src/debug/emacs-24.3-4/src/eval.c:2603
#11 0x00000001004883cd in Fcommand_execute (cmd=<optimized out>, record_flag=<optimized out>, keys=<optimized out>, special=<optimized out>)
at /usr/src/debug/emacs-24.3-4/src/keyboard.c:10241
#12 0x0000000100494ae8 in command_loop_1 ()
at /usr/src/debug/emacs-24.3-4/src/keyboard.c:1587
#13 0x00000001004f5c2e in internal_condition_case (bfun=bfun@entry=0x100494740 <command_loop_1>, handlers=4304470642, hfun=hfun@entry=0x10048ae40 <cmd_error>)
at /usr/src/debug/emacs-24.3-4/src/eval.c:1289
#14 0x000000010048630a in command_loop_2 (ignore=ignore@entry=4304412722)
at /usr/src/debug/emacs-24.3-4/src/keyboard.c:1168
#15 0x00000001004f5aef in internal_catch (tag=<optimized out>, func=func@entry=0x1004862e0 <command_loop_2>, arg=4304412722)
at /usr/src/debug/emacs-24.3-4/src/eval.c:1060
#16 0x000000010048a914 in command_loop ()
at /usr/src/debug/emacs-24.3-4/src/keyboard.c:1147
#17 recursive_edit_1 ()
at /usr/src/debug/emacs-24.3-4/src/keyboard.c:779
#18 0x000000010048ac47 in Frecursive_edit ()
at /usr/src/debug/emacs-24.3-4/src/keyboard.c:843
#19 0x000000010055e8ef in main (argc=<optimized out>, argv=<optimized out>)
at /usr/src/debug/emacs-24.3-4/src/emacs.c:1537
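
Several arguments in a trace like this are printed in decimal, which hides whether they resemble the stack addresses gdb shows in hex (0x22a180, 0x22a188). Earlier in the thread the same conversion was done by hand for 2268032 (0x229b80). The following is only a rough Python sketch of that check, not something from the emacs or Cygwin sources: the helper names are made up for illustration, and the "stack-like" window is an assumption picked to bracket the addresses quoted in this thread, not a range defined by Cygwin.

```python
import re

# ASSUMPTION: illustrative window only. It was chosen to bracket the stack
# pointers quoted in this thread (0x229b80, 0x22a180); it is not a documented
# Cygwin stack range.
STACK_LO = 0x200000
STACK_HI = 0x230000

# Matches "name=123" and "name@entry=123". Hex values such as 0x22a188 are
# skipped on purpose, since gdb already prints those as addresses.
DECIMAL_ARG = re.compile(r"(\w+)(?:@entry)?=(\d+)\b")


def decimal_args(frame_text):
    """Return (name, value) pairs for arguments gdb printed in decimal."""
    return [(name, int(value)) for name, value in DECIMAL_ARG.findall(frame_text)]


def looks_like_stack(value, lo=STACK_LO, hi=STACK_HI):
    """True if value falls inside the assumed stack-address window."""
    return lo <= value < hi


def report(frame_text):
    """Return (name, hex value, stack-like flag) for each decimal argument."""
    return [(name, hex(value), looks_like_stack(value))
            for name, value in decimal_args(frame_text)]


if __name__ == "__main__":
    # Argument list copied from frame #4 of the backtrace above.
    sample = ("exec_byte_code (bytestr=4294962344, vector=2268896, maxdepth=2, "
              "args_template=4303595040, nargs=4304157760)")
    for name, hexval, flag in report(sample):
        print(f"{name:14} {hexval:>12} {'stack-like' if flag else ''}")
```

A value landing in the window is at most a hint about where to look. In an optimized build the printed arguments may themselves be unreliable.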