Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm
Sender: cygwin-owner AT cygwin DOT com
Mail-Followup-To: cygwin AT cygwin DOT com
Delivered-To: mailing list cygwin AT cygwin DOT com
Date: Mon, 7 Feb 2005 15:44:12 -0500
From: Christopher Faylor
To: cygwin AT cygwin DOT com
Subject: Re: hyperthreading fix, try #1
Message-ID: <20050207204412.GA13789@trixie.casa.cgf.cx>
Reply-To: cygwin AT cygwin DOT com
References: <20050206202356 DOT GH13306 AT trixie DOT casa DOT cgf DOT cx>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.4.1i

On Mon, Feb 07, 2005 at 08:17:52AM +0100, Volker Bandke wrote:
>Which system configuration did you use to recreate the problem?

I got enough donations to purchase the following:

Motherboard: ASUS P4P800SE
Memory: 1G
CPU: CPU P4/3.0EGHz 800M 478P/1MB HT RT
HD: Samsung 120GB
Case: ASPIRE XINFINITY BL 350W RTL

I purchased this from Newegg. I love that company.

I put the system together in one night, turned it on, and it worked. All
of the lights came on correctly, the system booted with a CD, and
transferring data from my old system proceeded without a hitch, thanks
to my knoppix CD -- love that knoppix, too.

The one thing that took me forever to fix was getting XP running.
Somehow my XP CD got cracked with a big chunk taken out of it, so I had
to get a new CD, and I ended up transferring data from my old system
multiple times as I attempted to install the new CD without overwriting
all of my existing data.

The way I usually do this is to copy raw partitions over, since my
windows box is multi-boot and represents years of work. Sometimes the
OS figures out how to reconfigure itself, sometimes it needs a nudge.
In this case, it needed to be whacked with a large branch. I couldn't
get W2K working but I've held off further investigations in that for
another time.

>also, can you describe (in _short_ terms) the cause of the error?

Cygwin has a problem because normal pipe I/O on Windows is not
interruptible (generically speaking - you could kludge it on NT). So,
to work around this problem, it starts up pipe I/O in a thread and kills
the thread when a signal comes in. It's a sledgehammer approach to
interrupting pipe I/O.

The pipe thread uses a synchronization event to tell the initiating
reader when the pipe is all set, has grabbed its arguments and is ready
to go. This event was also used to tell the reader that there was a
successful read. Before my fix, cygwin did not reliably wait for both
events to happen so, after the first read on a pipe, it would get
out of sync. This would present a problem on any kind of SMP-like
system but it wouldn't be as noticeable on a non-SMP system.

Once I ran the test case twenty times or so, I went back and looked at
the code I'd previously stared at for hours and saw a few
synchronization issues. For once the backtrace from gdb showed that
something was clearly amiss.

So, the fix was to try much harder to ensure that we've correctly waited
for notification events, even in the scenario when cygwin thinks it has
to terminate a thread due to the arrival of a signal. It is possible
that the read has completed in that case and cygwin should not throw the
data away since the read really *wasn't* terminated by a signal.

Unfortunately, there is still a race here. I have an idea about how to
fix the race but it would introduce a destabilizing change that I'd
rather not chance before 1.5.13 is released. Given that I can't
reproduce the problem with the test script anymore, I think I'll release
cygwin with this change plus any other potential fixes required to
handle the "make -j" problem.

cgf

--
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
Problem reports:       http://cygwin.com/problems.html
Documentation:         http://cygwin.com/docs.html
FAQ:                   http://cygwin.com/faq/
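
For readers who want to see the shape of the problem described above, here is a rough sketch. It is not Cygwin's code: it is Python rather than the DLL's C++, and every name in it (interruptible_read, blocking_read, signal_event, poll) is made up for illustration. It assumes two separate events, one for "the thread has grabbed its arguments" and one for "the read completed", and it re-checks for completion before treating the read as interrupted. Python cannot kill a thread, so the worker is simply abandoned where Cygwin would terminate it.

```python
import threading

def interruptible_read(blocking_read, signal_event, poll=0.01):
    """Run blocking_read() in a helper thread; give up if a signal arrives.

    Hypothetical illustration only. Returns (data, "ok") or
    (None, "interrupted").
    """
    ready = threading.Event()   # helper has its arguments and is about to read
    done = threading.Event()    # helper finished the read
    result = {}

    def worker():
        ready.set()
        result["data"] = blocking_read()
        done.set()

    threading.Thread(target=worker, daemon=True).start()

    # Wait for the first notification before doing anything else.
    ready.wait()

    # Wait for the second notification, waking up periodically to
    # see whether a signal has arrived.
    while not done.wait(poll):
        if signal_event.is_set():
            # The read may have completed just as the signal came in.
            # If so, keep the data instead of throwing it away.
            if done.is_set():
                break
            # A window remains between the check above and this return,
            # so the sketch narrows the race but does not remove it.
            return None, "interrupted"

    return result["data"], "ok"
```

A read that finishes before the signal is acted on returns its data with "ok"; a read that is still blocked when the signal arrives returns "interrupted".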