Message-ID: <3BE313C1.9060900@ece.gatech.edu>
Date: Fri, 02 Nov 2001 16:44:33 -0500
From: Charles Wilson
To: Robert Collins
CC: cygwin AT cygwin DOT com
Subject: Re: setup test #5 - RC1
References: <1004690523 DOT 6940 DOT 18 DOT camel AT lifelesswks>

I got the BSOD/reboot on a *different* W2K machine -- one that has not exhibited other instability. That shoots down my "it was an unstable machine anyway" excuse -- the problem may actually be in setup.exe.

Here's what I was able to discover. The error log says:

    The computer has rebooted from a bugcheck. The bugcheck was:
    0x0000000a (0x00000014, 0x00000002, 0x00000000, 0x80454c5d).
    Microsoft Windows 2000 [v15.2195]. A dump was saved in:
    E:\WINNT\Minidump\Mini110201-01.dmp.

0x0000000a is an access violation type of error -- bad pointers, invalid accesses, etc. -- but I'd imagine this sort of thing can only cause a BSOD if it occurs in kernel mode. Perhaps it happens during a system call to which setup passes bad parameters?

I have the minidump, but not a full memory dump -- and since I don't have the kernel symbols, a w2k kernel debugger, etc., I don't think the minidump will be of much use.
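For anyone who wants to pick the bugcheck line apart without a kernel debugger, here is a minimal sketch that parses the event-log text quoted above and labels the four parameters. The labels are an assumption on my part: they follow the usual meaning of the parameters for bugcheck 0xA (IRQL_NOT_LESS_OR_EQUAL) as I understand Microsoft's documentation. The function name `parse_bugcheck` is made up for illustration.

```python
import re

# Parameter meanings for bugcheck 0xA (IRQL_NOT_LESS_OR_EQUAL).
# Assumption: these labels follow Microsoft's documented layout for 0xA.
PARAM_LABELS_0A = (
    "address referenced",
    "IRQL at time of fault",
    "access type (0 = read, 1 = write)",
    "address of the instruction that made the reference",
)


def parse_bugcheck(log_line):
    """Return (code, [param1..param4]) parsed from an event-log bugcheck line."""
    m = re.search(r"bugcheck was:\s*(0x[0-9a-fA-F]+)\s*\(([^)]*)\)", log_line)
    if m is None:
        raise ValueError("no bugcheck found in log line")
    code = int(m.group(1), 16)
    params = [int(p.strip(), 16) for p in m.group(2).split(",")]
    return code, params


# The line from the error log quoted in this message.
line = ("The bugcheck was: 0x0000000a "
        "(0x00000014, 0x00000002, 0x00000000, 0x80454c5d).")
code, params = parse_bugcheck(line)

print(f"bugcheck 0x{code:08x}")
for label, value in zip(PARAM_LABELS_0A, params):
    print(f"  {label}: 0x{value:08x}")
```

Read this way, the faulting instruction address (0x80454c5d) lies above 0x80000000, which is kernel address space on a default 32-bit Windows configuration. That is at least consistent with the guess that the fault happened in kernel mode rather than in setup.exe's own code.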
Doing a postmortem on my system, I find that the error occurred at a particular point. I was upgrading the following packages using setup-20011002-5.exe:

  cygrunsrv-0.94-1.tar.bz2    --> cygrunsrv-0.94-2.tar.bz2
  gdb-20010428-1.tar.gz       --> gdb-20010428-3.tar.bz2
  newlib-man-20001118.tar.gz  --> newlib-man-20001118-1.tar.bz2
  vim-6.0.11-1.tar.bz2        --> vim-6.0.46-1.tar.bz2

Here is the sequence of events (during the same setup.exe run):

  1. cygrunsrv uninstalled
  2. cygrunsrv installed
  3. gdb uninstalled
  4. gdb installed
  5. newlib-man uninstalled
  6. newlib-man installed
  7. vim uninstalled
  8. vim partially installed (this was interrupted by the bluescreen)

All but the last few files in the archive were installed ok. The last file installed was usr/share/vim/vim60/tutor/tutor (and it was installed completely, not partially). The missing files (i.e. setup didn't get a chance to install these at all) were:

  usr/share/vim/vim60/tutor/tutor.es
  usr/share/vim/vim60/tutor/tutor.fr
  usr/share/vim/vim60/tutor/tutor.info
  usr/share/vim/vim60/tutor/tutor.it
  usr/share/vim/vim60/tutor/tutor.ja.euc
  usr/share/vim/vim60/tutor/tutor.ja.sjis
  usr/share/vim/vim60/tutor/tutor.vim
  usr/share/vim/vim60/vimrc_example.vim

Of course, since the BSOD occurred prior to setup finishing, my /etc/installed.db was not updated to reflect the installation of ANY of these four packages.

So, the error occurred *after* completely installing a file (..tutor/tutor) but before beginning to write the next file to disk (..tutor/tutor.es). This smells like a pointer error in the decompression/untar functions.

I've seen this BSOD happen before (on the "unstable" machine, so I didn't blame setup) but only on BIG tarballs, now that I think about it. The strange thing is, it was rarely repeatable -- I'd just "try again" and it would work. (Another reason I didn't blame setup.exe.) Perhaps a pointer var being stored in a short int? I dunno, I'm just brainstorming.

Data: vim-6.0.46.tar (after bunzip2'ing) is 9,226,240 bytes long.
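As a quick sanity check on the size angle, here is a rough sketch that compares the tarball size quoted above against the largest value each common integer width can hold, and shows what would be left if that size were squeezed into 16 bits. It says nothing about what setup.exe actually does internally; the helper names `fits` and `truncate_to_u16` are invented for this illustration.

```python
TAR_SIZE = 9_226_240  # size of vim-6.0.46.tar in bytes, as quoted above

# Largest value each common integer width can hold.
LIMITS = {
    "signed 16-bit": 2**15 - 1,
    "unsigned 16-bit": 2**16 - 1,
    "signed 32-bit": 2**31 - 1,
    "unsigned 32-bit": 2**32 - 1,
}


def fits(value, limit):
    """True if value can be stored without overflowing the given limit."""
    return value <= limit


def truncate_to_u16(value):
    """Keep only the low 16 bits, as storing into an unsigned short would."""
    return value & 0xFFFF


for name, limit in LIMITS.items():
    print(f"{name:16s} max {limit:>13,d}  holds tar size: {fits(TAR_SIZE, limit)}")

print(f"tar size truncated to 16 bits: {truncate_to_u16(TAR_SIZE):,d}")
```

The 16-bit widths overflow long before 9M (any tarball over 64K would trip them), and the 32-bit widths are nowhere near their limits. So if a size or offset is being mangled, the size alone does not single out this particular tarball.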
No integer variable type overflows around 9M, does it? Perhaps the error depends on where in memory the data buffer for the in-memory-uncompressed .tar image is based....

I've got personal stuff this evening, but I'll try to hunt this one down this weekend if nobody beats me to it.

--Chuck