From: "Elfyn McBratney"
To: "cygwin", "Tim Allen"
Subject: Re: Problems with non-blocking I/O
Date: Thu, 27 Mar 2003 02:31:59 -0000
Message-ID: <026401c2f409$0bd6e170$cf6d86d9@ellixia>
References: <200303261132 DOT 20664 DOT tim AT proximity DOT com DOT au>

> I guess cygwin doesn't get a lot of testing with non-blocking I/O. We're
> having lots of problems. Using version 1.3.14, we find it barely usable but
> problematic and unreliable. With versions 1.3.20 and 1.3.21, it's quite
> unusable. The specific problems are, for 1.3.14:
>
> 1. selecting for writing on a non-blocking TCP socket will _always_ report
> selection, even when a write to that socket would block
>
> 2. sockets get closed for no apparent reason. This seems particularly likely
> after any process has been away from its select loop for a tenth of a second
> or two (either it's busy elsewhere, or the scheduler doesn't give it the
> processor because other processes are busy). Symptoms are "Connection reset
> by peer" errors (when the peer process appears to be perfectly happy to keep
> talking) and sometimes a EBADF error (not preceded by any other error, ie the
> socket has simply ceased to exist without warning).
>
> For 1.3.20 and 1.3.21, we find that non-blocking reads also fail, with the
> select always reporting selection on read, even when it should block, and,
> much worse than the case for 1.3.14, the read does not block but instead
> manufactures random data (presumably copying from some buffer or other).
>
> I'm working on making simple test cases for this. I have one that demonstrates
> the first problem, which I shall attach here. I'll persist with making test
> cases for the other problems (I need to strip out irrelevant stuff from the
> app) and shall post them when I can reproduce the problems easily.
>
> The attached source files are for a pair of programs, a client and a server.
> The server accepts connections on port 8888 and copies any data it receives
> back to the same socket. The client connects to that port on INADDR_LOOPBACK.
> It takes two file names as command line args, reads the first file, sends it
> to the socket, then writes whatever comes back from the socket to the file
> given as the second arg. Doing a diff on the two files after both programs
> complete is a test that everything worked. The bigger the file, the more
> stringent the test; I've been testing with files in the tens to hundreds of
> megabytes range.
>
> Both programs produce copious output on stdout to tell you what they are
> doing. When run on linux, the programs run very quickly, with no observed
> problems at all. On cygwin 1.3.14, on Windows 2000, you can see that the
> server side in particular spins through select, reporting EWOULDBLOCK all the
> time when selected for write. If you pause (eg ctrl-S) the client, you can
> see it even more clearly. The server should (and on linux does) itself pause
> in that situation, waiting to be able to write to the socket. On cygwin it
> instead keeps going, constantly raising select conditions and constantly
> finding that it would block on the write, doing a busy-wait. A
> single-processor box illustrates the problem best, as with two processors the
> busy-wait doesn't look as bad.

Tim,

I tried your testcase on files ranging from ~50MB to ~1.5GB (and a few 100k
files), and in every case the `diff' of the original file against the output
file showed the two to be identical, i.e. no output from `diff'.
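For anyone who wants to probe problem 1 in isolation, here is a minimal sketch
of the check involved. It is not Tim's attached testcase. It assumes Python
rather than C, uses socket.socketpair() as a stand-in for the TCP loopback
connection, and the helper names (fill_until_blocked, is_writable, drain) are
made up for illustration. The idea: fill a non-blocking socket until a write
would block, then ask select() whether it still reports the socket as writable.
On a system that behaves as Tim expects, it should not, until the peer has
read some data.

```python
import select
import socket


def fill_until_blocked(sock, chunk=b"x" * 4096):
    """Write to a non-blocking socket until the kernel refuses more data.

    Returns the number of bytes accepted before the write would block.
    """
    sent = 0
    while True:
        try:
            sent += sock.send(chunk)
        except BlockingIOError:  # EWOULDBLOCK / EAGAIN
            return sent


def is_writable(sock, timeout=0.0):
    """Ask select() whether a write on sock is reported as possible."""
    _, writable, _ = select.select([], [sock], [], timeout)
    return sock in writable


def drain(sock, bufsize=65536):
    """Read everything currently queued on a non-blocking socket."""
    received = 0
    while True:
        try:
            data = sock.recv(bufsize)
        except BlockingIOError:  # nothing left to read right now
            return received
        if not data:  # peer closed the connection
            return received
        received += len(data)


if __name__ == "__main__":
    # Assumption: socketpair() stands in for the client/server connection.
    writer, reader = socket.socketpair()
    writer.setblocking(False)
    reader.setblocking(False)

    queued = fill_until_blocked(writer)
    print("queued", queued, "bytes before the write would block")
    print("select reports writable while full:", is_writable(writer))

    drain(reader)
    print("select reports writable after drain:", is_writable(writer, 1.0))

    writer.close()
    reader.close()
```

If select() reports the full socket as writable, a select loop built on it will
spin the way the server in Tim's description does.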
Now I have one WAG (perhaps not so WA): when you say you tried files that are
"tens of hundreds of megabytes", do you mean files larger than ~2GB (more than
one ten :-)? If so, then that might be your problem. Cygwin doesn't yet (there
are spanners in the works) support files larger than 2GB.

Now I'm no socket expert (sure, I can do the basic echo server ;-) so perhaps
someone could confirm/deny this?

Regards,

Elfyn McBratney
elfyn AT exposure DOT org DOT uk
www.exposure.org.uk

--
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
Bug reporting:         http://cygwin.com/bugs.html
Documentation:         http://cygwin.com/docs.html
FAQ:                   http://cygwin.com/faq/