Date: Sat, 25 Aug 2001 15:12:26 +0300
From: "Eli Zaretskii"
Sender: halo1 AT zahav DOT net DOT il
To: "Andrew Cottrell"
Message-Id: <1438-Sat25Aug2001151225+0300-eliz@is.elta.co.il>
X-Mailer: Emacs 20.6 (via feedmail 8.3.emacs20_6 I) and Blat ver 1.8.9
CC: djgpp-workers AT delorie DOT com, snowball3 AT bigfoot DOT com
In-reply-to: <003701c12d40$a4d1e320$0a02a8c0@acceleron> (acottrel@ihug.com.au)
Subject: Re: Bash 2.05 beta 23-Aug-2001 with Win 2K
References: <002801c12892$0badcb80$0a02a8c0 AT acceleron> <9003-Sun19Aug2001152113+0300-eliz AT is DOT elta DOT co DOT il> <003701c12d40$a4d1e320$0a02a8c0 AT acceleron>
Reply-To: djgpp-workers AT delorie DOT com
Errors-To: nobody AT delorie DOT com
X-Mailing-List: djgpp-workers AT delorie DOT com
X-Unsubscribes-To: listserv AT delorie DOT com
Precedence: bulk

> From: "Andrew Cottrell"
> Date: Sat, 25 Aug 2001 18:33:39 +1000
>
> Today I have had time to have a look at the problem with Bash running on Win
> 2K. I did some testing and then checked for updates and found that there was
> an update from the 23-Aug-2001 which I downloaded and compiled. There are
> some very interesting memory corruption quirks I have found. These are
> bizarre to say the least.
>
> Mark, this looks like another Win 2K specific problem.
>
> The quirks I found were:
>
> 1) If I add a debugging printf in the array_init() function then I can get
> bash to start re-configuring GREP 2.4.
> The modified array_init() is:
> static void
> array_init(struct dynamic_array *array, const char *s, size_t size)
> {
>   array->start = malloc(size + 1);
>   array->ptr = array->start;
>   array->end = array->start + size;
> #if 1
>   printf("%s %d , array->start = %p ,array->end = %p, malloc(%lu)\n",
>          __FILE__, __LINE__, (void *)array->start, (void *)array->end,
>          (unsigned long)(size + 1));
> #endif
>   if (s)
>     array_puts(array, s);
> }
>
> Even more puzzling is if I remove the printf then Bash crashes if LFN=y, but
> does not crash if LFN=n

I don't see anything especially interesting in these observations: they
just confirm that there's some memory corruption in the program.  printf
calls malloc, so it might change the place of the crash.  Likewise,
setting LFN modifies the sequence of calls to malloc, and thus has the
same effect of moving the place and the exact way it crashes.

The interesting question is: can you consistently reproduce the crash if
you run Bash under GDB?

If you can reproduce this, the next step is to find the address of a
variable in malloc's internal data structures that causes the crash.
Once you identify this address (by poking around in malloc's code when it
crashes), the next step is to put a hardware-assisted watchpoint on that
address, and then run the same crashing scenario again--this should catch
the code which corrupts malloc data structures red-handed.
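
As a rough illustration of the "find the address first" step, here is a minimal sketch of a guarded allocation. It is a different, software-only technique and not the hardware watchpoint itself: a known byte pattern is placed right after the usable area, and a check reports the address of the first byte that has been overwritten. The names guard_alloc, guard_check, GUARD_LEN and GUARD_BYTE are hypothetical and exist neither in Bash nor in the DJGPP library. The sketch also assumes the corruption is a simple overrun past the end of a block, which has not been established for this crash.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helpers for illustration; not part of Bash or libc. */
#define GUARD_LEN  8      /* number of guard bytes placed after the block */
#define GUARD_BYTE 0xA5   /* pattern that an overrun is likely to disturb */

/* Allocate `size` usable bytes followed by GUARD_LEN bytes of GUARD_BYTE. */
static void *guard_alloc(size_t size)
{
    unsigned char *p = malloc(size + GUARD_LEN);

    if (p == NULL)
        return NULL;
    memset(p + size, GUARD_BYTE, GUARD_LEN);
    return p;
}

/* Return the address of the first damaged guard byte, or NULL if the
   guard area after the `size` usable bytes is still intact. */
static unsigned char *guard_check(void *block, size_t size)
{
    unsigned char *guard = (unsigned char *)block + size;
    size_t i;

    assert(block != NULL);
    for (i = 0; i < GUARD_LEN; i++)
        if (guard[i] != GUARD_BYTE)
            return guard + i;
    return NULL;
}
```

If guard_check() returns a non-NULL address, that address is a candidate for a GDB watchpoint on the next run, for example `watch *(unsigned char *) ADDRESS` with ADDRESS replaced by the reported value. This only helps if the allocation sequence, and hence the address, is the same from run to run.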