Date: Mon, 19 Jan 2004 15:37:39 -0600 (CST)
From: Brian Ford
To: cygwin AT cygwin DOT com
Subject: Re: cygwin source-patch fixing deadlock while writing to serial port
In-Reply-To: <20040119211440.GG19903@redhat.com>
References: <400C4757 DOT 3060208 AT hhschmidt DOT de> <20040119211440 DOT GG19903 AT redhat DOT com>
Reply-To: cygwin AT cygwin DOT com

On Mon, 19 Jan 2004, Christopher Faylor wrote:

> On Mon, Jan 19, 2004 at 10:08:39PM +0100, H. Henning Schmidt wrote:
> >I found a potential deadlock while writing to a serial port (e.g.
> >/dev/com1) that has been opened as O_RDWR. The deadlock occurs from time
> >to time (not sure about exact conditions) when I write to that port,
> >while there is data coming in (e.g. from an external device) and I do
> >not read away that data fast enough from the port.
> >
> >I did provide a test case a while ago in
> >http://sources.redhat.com/ml/cygwin/2003-03/msg01529.html. I dug into
> >the issue some more now and found that the executing thread got
> >sometimes deadlocked in fhandler_serial::raw_write(). It basically ends
> >up in a for(;;) loop and just never hits the break;
> >

Exactly.  When the input buffer overflows, all serial communications
cease and calls fail with ERROR_OPERATION_ABORTED.  If you only call
write, then the ClearCommError() necessary to start things up again is
never called, and you are stuck in that infinite loop.

> >The applied patch adds a safety exit to that for(;;) loop.
> >This fixes the testcase referenced above.
> >

Yuck!  No, this is not the proper fix.

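To make the failure mode above concrete, here is a minimal sketch of a write loop that clears the error state when a write aborts, rather than bailing out of the loop. It is a toy model only, not the Cygwin source and not the partial patch mentioned in this thread: struct toy_port, toy_write(), toy_clear_error() and write_with_recovery() are all hypothetical names. toy_write() stands in for a write that fails with ERROR_OPERATION_ABORTED, and toy_clear_error() stands in for ClearCommError().

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a serial port; every name here is hypothetical. */
struct toy_port {
    int overrun;        /* nonzero once the input buffer has overflowed */
    int clear_calls;    /* how many times the error state was cleared */
    size_t bytes_sent;  /* bytes accepted by toy_write() */
};

enum toy_status { TOY_OK, TOY_ABORTED };

/* Models a write that is refused while the overrun condition is latched
   (the ERROR_OPERATION_ABORTED case described in the message). */
enum toy_status toy_write(struct toy_port *p, size_t len)
{
    if (p->overrun)
        return TOY_ABORTED;
    p->bytes_sent += len;
    return TOY_OK;
}

/* Models ClearCommError(): resets the latched error so I/O can resume. */
void toy_clear_error(struct toy_port *p)
{
    p->overrun = 0;
    p->clear_calls++;
}

/* Same for(;;) shape as the loop under discussion, but an aborted write
   triggers a clear-and-retry, so the break is eventually reached. */
void write_with_recovery(struct toy_port *p, size_t len)
{
    assert(p != NULL);
    for (;;) {
        if (toy_write(p, len) == TOY_OK)
            break;
        toy_clear_error(p);
    }
}
```

Without the toy_clear_error() call, a port that starts with overrun set would spin in that loop forever, which is the hang reported in the test case.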
> >This might not be the last problem lingering in the serial access code
> >(there are some FIXME tokens still around ...), but it is definitely an
> >improvement for me. I thought I'd share that with you.
>
> Can you convince me that this isn't just a band-aid?  I don't understand
> why cygwin *shouldn't* hang in a situation like this.  There are
> certainly similar situations where this happens on linux.
>
> Perhaps we need a low_priority_sleep (10) in the loop in that situation
> or something.
>

No.  I have a partial patch for the above, but I am in the process of
getting a new Windows box and shuffling all my data.  I'll try to submit
it when things settle if no one beats me to it.

-- 
Brian Ford
Senior Realtime Software Engineer
VITAL - Visual Simulation Systems
FlightSafety International
Phone: 314-551-8460
Fax:   314-551-8444