Message-ID: <001c01c56e0d$efcf0800$1f3ca8c0@AlohaSunset.com>
From: "Mark Pizzolato"
References: <000f01c56d59$7d976d40$1f3ca8c0 AT AlohaSunset DOT com>
Subject: Re: Multi Threaded programs deadlock doing simple I/O operations
Date: Fri, 10 Jun 2005 15:44:16 -0700

On Thursday, June 09, 2005 at 6:12 PM, Mark Pizzolato wrote:
> On Thursday, June 09, 2005 at 3:35 PM, Christopher Faylor wrote:
> > On Wed, Jun 08, 2005 at 05:43:59PM -0700, Mark Pizzolato wrote:
> > >There is a serious problem for multi threaded programs doing simple I/O
> > >operations in cygwin (open, dup, fdopen, fclose, and close).
> > >
> > >The attached 81 line test program clearly demonstrates the issue (by
> > >hanging and no longer consuming CPU or performing any I/O operations).
> >
> > Thanks for the relatively small test case.  That was enough to track the
> > problem down.  I'm generating a new snapshot with a fix for this
> > problem.
>
> The snapshot looks good!
>
> This fixes the stability problems with clamav's clamd that I've been
> chasing for a long time.

Some more follow-up here. I'm running with the 20050609 snapshot DLL.

clamav's clamd now runs better than it ever has for me on cygwin, until
it doesn't. Once it starts to run poorly, it won't run cleanly again until
I reboot the system (I haven't actually tried merely exiting all
processes).

To be more specific about the "poor" behavior:

- pthread_mutex_unlock fails, leaving errno with a value of 90. This is
  in a place where there is only one path through about a dozen lines of
  code and the mutex is definitely locked.
  There may have been a call to pthread_create, and there was definitely
  a call to pthread_cond_signal.
- Once the above error happens, calls (by the same thread) to accept()
  fail on a file descriptor which we've been using successfully all along
  and which we only close when the program exits.

So some change introduced recently (since 1.5.17-1, and possibly in
20050609) fixes the dup() issue, but mutex operations are now failing in
strange ways.

Sorry not to have a simple isolated test case for this. The good news is
that once it breaks, it won't run correctly again until a reboot.

Ideas?

Thanks.

- Mark Pizzolato
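
For anyone trying to reproduce this, here is a rough sketch of the lock / signal / unlock pattern described in the first bullet. It is not the clamd code: the names queue_lock, queue_cond, pending and enqueue_and_signal are made up for illustration. It logs the value returned by each pthread call, since POSIX specifies that these functions report failure through their return value rather than through errno.

```c
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical shared state; these names are illustrative, not clamd's. */
static pthread_mutex_t queue_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  queue_cond = PTHREAD_COND_INITIALIZER;
static int pending = 0;

/*
 * Lock the mutex, update the shared counter, signal the condition
 * variable, then unlock. Returns 0 on success, or the error number
 * returned by the first pthread call that failed.
 */
int enqueue_and_signal(void)
{
    int rc = pthread_mutex_lock(&queue_lock);
    if (rc != 0) {
        fprintf(stderr, "pthread_mutex_lock: %s (%d)\n", strerror(rc), rc);
        return rc;
    }

    pending++;

    rc = pthread_cond_signal(&queue_cond);
    if (rc != 0)
        fprintf(stderr, "pthread_cond_signal: %s (%d)\n", strerror(rc), rc);

    /* Always attempt the unlock, even if the signal failed. */
    int unlock_rc = pthread_mutex_unlock(&queue_lock);
    if (unlock_rc != 0)
        fprintf(stderr, "pthread_mutex_unlock: %s (%d)\n",
                strerror(unlock_rc), unlock_rc);

    return rc != 0 ? rc : unlock_rc;
}
```

Calling enqueue_and_signal() from the worker threads and logging any nonzero return would show which call fails first and with which error number, which may be easier to interpret than whatever happens to be left in errno.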