Mail Archives: cygwin/2004/01/20/06:28:38
--------------070903060102040708090002
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
> > > > On Mon, Jan 19, 2004 at 10:08:39PM +0100, H. Henning Schmidt wrote:
> > > > >I found a potential deadlock while writing to a serial port (e.g.
> > > > >/dev/com1) that has been opened as O_RDWR. The deadlock occurs
from time
> > > > >to time (not sure about exact conditions) when I write to that
port,
> > > > >while there is data coming in (e.g. from an external device)
and I do
> > > > >not read away that data fast enough from the port.
> > > > >
> > > > >I did provide a test case a while ago in
> > > > >http://sources.redhat.com/ml/cygwin/2003-03/msg01529.html. I
digged into
> > > > >the issue some more now and found that the executing thread got
> > > > >sometimes deadlocked in fhandler_serial::raw_write(). It
basically ends
> > > > >up in a for(;;) loop and just never hits the break;
> > > > >
> > > Exactly. When the input buffer overflows, all serial
communications cease
> > > and calls exit with ERROR_OPERATION_ABORTED. If you only call
write, then
> > > the ClearCommError() necessary to start things up again is never
called,
> > > and you stick in that infinite loop.
> >
> > Ok, I will be happy to learn what the proper fix looks like in your
opinion.
> >
> When I can get to that code, I'll post it.
>
> > However, I have a hard time accepting the fact that I cannot write
OUT the
> > serial port because my INput buffer overflows. At least when I have
switched
> > off all kinds of flow control (which I have), then I consider these
to be two
> > independent streams that happen to share one filedesc. Or should I
perhaps
> > open two independent filedescriptors in this case, one write-only, one
> > read-only?
> >
> > Are you saying that the underlying Win-API enforces what you state
above, or does
> > this really make sense in some way and I just didn't get it yet?
Your statement
> > sounds to me like I am forced to keep a second thread around, and
if only to get
> > me out of that trap, once I get hung there ... does not sound too
elegant to me ...
> >
> Yes. I was stating the limitations of the Win-API.
>
> > > > >The applied patch adds a safety exit to that for(;;) loop.
> > > > >This fixes the testcase referenced above.
> > > > >
> > > Yuck! No, this is not the proper fix.
> > >
> > > > >This might not be the last problem lingering in the serial
access code
> > > > >(there are some FIXME tokens still around ...), but it is
definitely an
> > > > >improvement for me. I thought I'd share that with you.
> > > >
> > > > Can you convince me that this isn't just a band-aid? I don't
understand
> > > > why cygwin *shouldn't* hang in a situation like this. There are
> > > > certainly similar situations where this happens on linux.
> > > >
> > > > Perhaps we need a low_priority_sleep (10) in the loop in that
situation
> > > > or something.
> > > >
> > > No. I have a partial patch for the above, but I am in the process of
> > > getting a new Windows box and shuffling all my data. I'll try to
submit
> > > it when things settle if no one beats me to it.
> >
> > who am I to beat you ...
> > but be assured that I'll be waiting eagerly for this patch.
> > This issue has been kicking me for a long time (letting my app hang
up every once in
> > a while), and I really want get this fixed. So now that I know that
there is a better
> > way to do it ... that's what I'm longing for then :-)
> >
> While you're waiting, why don't you try what I said. Put a
ClearCommError
> call in the switch for the error stated. I bet you get 90+% of what you
> want.
ok. did this. it works. thanks a lot for the tip.
With this change, my deadlock-prevention (which I still left in there,
paranoyed as I am) just never comes into action.
I attach my modified patch to this message for reference (to be applied
to the original 1.5.5-1 sources)
;Henning
--------------070903060102040708090002
Content-Type: text/plain;
name="patch.cygwin-1.5.5-1"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
filename="patch.cygwin-1.5.5-1"
diff -cr cygwin-1.5.5-1.orig/winsup/cygwin/fhandler_serial.cc cygwin-1.5.5-1/winsup/cygwin/fhandler_serial.cc
*** cygwin-1.5.5-1.orig/winsup/cygwin/fhandler_serial.cc Sat Jun 21 02:12:35 2003
--- cygwin-1.5.5-1/winsup/cygwin/fhandler_serial.cc Tue Jan 20 10:57:31 2004
***************
*** 153,174 ****
int
fhandler_serial::raw_write (const void *ptr, size_t len)
{
! DWORD bytes_written;
OVERLAPPED write_status;
memset (&write_status, 0, sizeof (write_status));
write_status.hEvent = CreateEvent (&sec_none_nih, TRUE, FALSE, NULL);
ProtectHandle (write_status.hEvent);
! for (;;)
{
if (WriteFile (get_handle (), ptr, len, &bytes_written, &write_status))
break;
switch (GetLastError ())
{
! case ERROR_OPERATION_ABORTED:
! continue;
case ERROR_IO_PENDING:
break;
default:
--- 153,179 ----
int
fhandler_serial::raw_write (const void *ptr, size_t len)
{
! DWORD bytes_written = 0;
OVERLAPPED write_status;
memset (&write_status, 0, sizeof (write_status));
write_status.hEvent = CreateEvent (&sec_none_nih, TRUE, FALSE, NULL);
ProtectHandle (write_status.hEvent);
! for (int prevent_deadlock=0 ; prevent_deadlock<10 ; prevent_deadlock++)
{
if (WriteFile (get_handle (), ptr, len, &bytes_written, &write_status))
break;
switch (GetLastError ())
{
! case ERROR_OPERATION_ABORTED: {
! DWORD ev;
! COMSTAT st;
! if (!ClearCommError (get_handle (), &ev, &st))
! __seterrno ();
! bytes_written = 0;
! continue; }
case ERROR_IO_PENDING:
break;
default:
--------------070903060102040708090002
Content-Type: text/plain; charset=us-ascii
--
Unsubscribe info: http://cygwin.com/ml/#unsubscribe-simple
Problem reports: http://cygwin.com/problems.html
Documentation: http://cygwin.com/docs.html
FAQ: http://cygwin.com/faq/
--------------070903060102040708090002--
- Raw text -