Delivered-To: mailing list cygwin-developers AT cygwin DOT com
Message-ID: <00af01c2341b$b6138890$6132bc3e@BABEL>
From: "Conrad Scott"
Subject: Signals and the such-like
Date: Thu, 25 Jul 2002 21:41:53 +0100

[Sorry: Long email]

About a week ago I discovered a race condition in the UNIX domain socket emulation in cygwin. I've got a patch for this that works (and fixes several other small problems) bar one *minor* issue, and since I'm out of ideas, I hope someone else out there has got some advice for me (even if it's only "don't do that!").

Here goes. I've put together a new UNIX domain handshake protocol, but somewhere it's got to pause long enough for the server to pick up the client's half of the protocol, since with a socket a client can get a connection, write some data and close the socket before the server has accepted the connection (the connection's just sitting on the pending queue).

So, I've got a piece of code in the fhandler_socket::close method that only closes the client's secret event once the client has received the server's okay signal *or* a (Unix) signal arrives *or* the server closes its end of the connection (i.e. the server exits w/o ever accepting the connection).

This is all fine and dandy except for two situations: if the client receives an unhandled signal that should cause it to die *or* if the client exits w/o closing the socket. At this point, if the server is itself blocked and not accepting the connection, the client will not exit and can't be ctrl-c'd either.
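To make the wait described above concrete, here is a minimal portable C++ sketch; it is not the actual cygwin code. It models the three wake-up sources with a single condition variable. `CloseGate`, `WakeReason`, `raise` and `wait` are hypothetical names chosen for illustration, and the real patch would presumably wait on Win32 event handles rather than a condition variable.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// Hypothetical stand-ins for the three wake-up sources named in the text.
// None of these names come from the cygwin sources.
enum class WakeReason { ServerOkay, SignalArrived, ServerClosed };

class CloseGate {
public:
    // Called by whichever party raises an event. Only the first event
    // raised is recorded; later ones are ignored.
    void raise(WakeReason r) {
        {
            std::lock_guard<std::mutex> lock(m_);
            if (!raised_) {
                raised_ = true;
                reason_ = r;
            }
        }
        cv_.notify_all();
    }

    // Blocks until one of the three events has been raised, then reports
    // which one arrived first. With no timeout, this can block forever.
    WakeReason wait() {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return raised_; });
        assert(raised_);  // the predicate guarantees this on return
        return reason_;
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    bool raised_ = false;
    WakeReason reason_ = WakeReason::ServerOkay;
};
```

If nothing ever calls `raise`, `wait` never returns, which is the hang described above.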
The problems in the two situations are caused by the same issue:

*) If the client receives an unhandled signal, e.g. SIGINT, the do_exit function is called, which then calls close_all_files. But it does this w/o setting the 'signal_arrived' event, so none of the events are set that the fhandler_socket::close method is waiting on (at least, not in the particular circumstances mentioned here).

*) If the client exits w/o closing the socket, again it gets stuck in fhandler_socket::close since no events are going to be raised.

Alternatives (AFAICT):

*) Just put a timeout in the fhandler_socket::close routine (as was effectively the case in the previous protocol).

*) In do_exit, set a global flag that the close routine can pick up. There is already such a flag: exit_already in "exceptions.cc" but this is static and so inaccessible. Or is there an existing mechanism that I'm missing?

*) A partial solution (and one that might be worth doing regardless of any other solution) would be to set the 'signal_arrived' event before calling the do_exit function when dying from a signal's arrival. I've tried this and it seems to cause no problems, but is only a partial solution to the problem. (Unless it's always set on exit . . . yuck?)

*) It would be okay perhaps to let the client block in this way, *if* it could still be killed by a signal whilst blocked. *But* the do_exit code in "dcrt0.cc" ignores a slew of signals, so if a process does get blocked while exiting, it can't then be (easily) killed. [You can still 'kill -9' it at this point.] Has someone

*) Or am I worrying too much? Don't worry about it much, bung in a timeout, it'll hardly ever happen, relax?

// Conrad
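For illustration, the first two alternatives above (a timeout in the close routine, and a global flag set in do_exit) might be combined roughly as in the following portable C++ sketch. This is a sketch under assumptions, not the cygwin implementation: `exiting`, `TimedCloseGate` and `CloseResult` are hypothetical names, and the flag stands in for something like an exported version of exit_already.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical global flag; in the scheme described above it would be
// set at the start of do_exit, before close_all_files runs.
std::atomic<bool> exiting{false};

enum class CloseResult { Event, Exiting, TimedOut };

class TimedCloseGate {
public:
    // Called when any of the awaited events (okay, signal, server
    // close) occurs.
    void raise() {
        {
            std::lock_guard<std::mutex> lock(m_);
            raised_ = true;
        }
        cv_.notify_all();
    }

    // Waits for an event, but gives up after `timeout`, and does not
    // wait at all if the process is already exiting.
    CloseResult wait(std::chrono::milliseconds timeout) {
        assert(timeout.count() >= 0);
        if (exiting.load()) {
            return CloseResult::Exiting;
        }
        std::unique_lock<std::mutex> lock(m_);
        bool got_event = cv_.wait_for(lock, timeout,
                                      [this] { return raised_; });
        return got_event ? CloseResult::Event : CloseResult::TimedOut;
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    bool raised_ = false;
};
```

The flag is only checked on entry, which covers the case where close is reached from the exit path; the timeout bounds the wait in every other case.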