Date: Fri, 16 Aug 2013 11:56:48 +0300
From: Eli Zaretskii
Subject: Re: 64-bit emacs crashes a lot
In-reply-to: <520D4036.8010303@cs.utoronto.ca>
To: Ryan Johnson
Cc: cygwin AT cygwin DOT com
Reply-to: Eli Zaretskii
Message-id: <8361v6nhdb.fsf@gnu.org>

I'm not subscribed to this list, so if you want me to reply, please CC
me explicitly.  Besides, this discussion should be moved to
emacs-devel AT gnu DOT org, since I don't see anything Cygwin-specific
here at this point.

> Date: Thu, 15 Aug 2013 16:55:18 -0400
> From: Ryan Johnson
>
> On 15/08/2013 1:10 PM, Eli Zaretskii wrote:
> >> Date: Thu, 15 Aug 2013 12:58:02 -0400
> >> From: Ken Brown
> >> CC: Eli Zaretskii
> >>
> >> Eli is the expert on bidi.c (he wrote it). He can probably tell you
> >> whether you've really bumped into an emacs bug here.
> > There's nothing wrong with bidi.c here, it just aborts because it is
> > handed an invalid character codepoint. It would have been useful to
> > see the value of that character.
> I guess I would just consider crashing to be overkill for a bad byte on
> the input stream...

It's not a crash, it's a deliberate abort.  Any invalid codepoint at
such a low level of the Emacs display engine means only one thing: a
bug, and a grave one at that.  Such bugs must be flagged prominently
and unequivocally, prompting users to report them.

We could in principle "recover" by substituting some other character,
but such recovery would only sweep a grave problem under the carpet.
Since Emacs isn't a safety-critical program, and auto-saves your edits
before it commits suicide, such a recovery feature is deemed
inappropriate, and detrimental to the general quality of Emacs code in
the long run.

> and in any case, if 5-byte UTF-8 is illegal, and
> worth dying for, wouldn't it make sense to die right away rather than
> processing it so something else can croak down the road?

See above: yes, it's worth dying for, because I'm quite sure this is a
sign of very serious trouble in the session anyway.  Why does it
matter to you, as a user, whether we abort here or "down the road"?

The principle is to die as soon as possible, because in many cases
this makes it faster and easier to identify the culprit.  IOW, dying
sooner and faster helps the Emacs maintainers find and fix problems,
without any real effect on the users.

> > Anyway, I generally agree that this is probably some memory
> > corruption, as I'm guessing that the text in the window was all ASCII
> > in this case, so any character codepoint beyond 127 is not to be
> > expected.
> I set a breakpoint there, since I thought it was guaranteed to lead to a
> crash if it ever ran, but it turns out that's not true. Invoking M-x
> compile triggers the breakpoint twice in a row with the following
> (valid!) 5-byte UTF-8:
>
> 111110XX 10XXXXXX 10XXXXXX 10XXXXXX 10XXXXXX
> 11111000 10001111 10111111 10111101 10111111
>
> The value is always the same, and corresponds to the code point
> U+3FFF7F, FWIW.

If the value is positive and below 3FFFFF, then the abort could not
have happened.  Therefore, I believe that the optimized build lies to
GDB, and the actual value is not what you see in GDB.  Alternatively
(and that is also a known effect of debugging an optimized build), the
abort happened not where you think, but rather a few lines below:

      default_type = (bidi_type_t) XINT (CHAR_TABLE_REF (bidi_type_table, ch));
      /* Every valid character code, even those that are unassigned by
         the UCD, have some bidi-class property, according to
         DerivedBidiClass.txt file.  Therefore, if we ever get
         UNKNOWN_BT (= zero) code from CHAR_TABLE_REF, that's a bug.  */
      if (default_type == UNKNOWN_BT)
        emacs_abort ();  <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<

Optimized code frequently emits only one call to emacs_abort, and
converts the other calls to a jump to the locus of that single call.

I really suggest getting an unoptimized build and debugging that
instead.  Debugging optimized builds, even with GCC 4.8, is a hard and
frustrating task.  In particular, most of the backtraces you posted
don't make any sense at all -- a frequent problem in optimized builds.