Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm
Sender: cygwin-owner AT cygwin DOT com
Mail-Followup-To: cygwin AT cygwin DOT com
Delivered-To: mailing list cygwin AT cygwin DOT com
From: Tim Allen
Organization: Proximity Pty Ltd
To: cygwin AT cygwin DOT com
Subject: Problems with non-blocking I/O
Date: Wed, 26 Mar 2003 11:32:20 +1100
User-Agent: KMail/1.4.1
MIME-Version: 1.0
Content-Type: Multipart/Mixed; boundary="------------Boundary-00=_WTYBPR2B1G9EIHSAGJ1B"
Message-Id: <200303261132.20664.tim@proximity.com.au>

--------------Boundary-00=_WTYBPR2B1G9EIHSAGJ1B
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 8bit

I guess cygwin doesn't get a lot of testing with non-blocking I/O. We're having lots of problems. Using version 1.3.14, we find it barely usable but problematic and unreliable. With versions 1.3.20 and 1.3.21, it's quite unusable.

The specific problems are, for 1.3.14:

1. selecting for writing on a non-blocking TCP socket will _always_ report selection, even when a write to that socket would block

2. sockets get closed for no apparent reason. This seems particularly likely after any process has been away from its select loop for a tenth of a second or two (either it's busy elsewhere, or the scheduler doesn't give it the processor because other processes are busy). Symptoms are "Connection reset by peer" errors (when the peer process appears to be perfectly happy to keep talking) and sometimes an EBADF error (not preceded by any other error, i.e. the socket has simply ceased to exist without warning).

For 1.3.20 and 1.3.21, we find that non-blocking reads also fail, with the select always reporting selection on read, even when it should block, and, much worse than the case for 1.3.14, the read does not block but instead manufactures random data (presumably copying from some buffer or other).

I'm working on making simple test cases for this.
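
Problem 1 in the list above can also be checked in isolation. The following is only a minimal sketch, not part of the attached programs: the helper names countFalseWritable and runWritableCheck are hypothetical, and to stay self-contained it uses a socketpair() instead of a TCP connection, which is an assumption (the same helper should apply to any non-blocking stream socket descriptor). It polls select() for writability, writes whenever select() claims the descriptor is writable, and counts how often that write nevertheless fails with EWOULDBLOCK. The peer never reads, so on a correct implementation the loop ends as soon as the buffer fills and the count is expected to be zero.

```cpp
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical helper: counts how many times select() reports fd as
// writable although the following write() fails with EWOULDBLOCK/EAGAIN.
// Stops when select() reports "not writable" or after maxFalsePositives
// contradictions. Returns -1 on any unexpected error.
int countFalseWritable(int fd, int maxFalsePositives = 100)
{
    assert(fd >= 0 && fd < FD_SETSIZE);   // precondition for FD_SET

    char chunk[4096];
    memset(chunk, 'x', sizeof(chunk));
    int falsePositives = 0;

    while (falsePositives < maxFalsePositives) {
        fd_set writeSet;
        FD_ZERO(&writeSet);
        FD_SET(fd, &writeSet);
        timeval noWait = {0, 0};          // poll only, never block

        int ready = select(fd + 1, NULL, &writeSet, NULL, &noWait);
        if (ready < 0)
            return -1;
        if (ready == 0)
            break;                        // select() says a write would block

        ssize_t n = write(fd, chunk, sizeof(chunk));
        if (n < 0) {
            if (errno == EWOULDBLOCK || errno == EAGAIN)
                ++falsePositives;         // select() and write() disagree
            else
                return -1;
        }
    }
    return falsePositives;
}

// Hypothetical driver: builds a connected stream socket pair (an assumption;
// the report itself concerns TCP sockets), makes one end non-blocking and
// never reads from the other end, so the send buffer eventually fills.
int runWritableCheck()
{
    int fds[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) == -1)
        return -1;

    int flags = fcntl(fds[0], F_GETFL);
    if (flags == -1 || fcntl(fds[0], F_SETFL, flags | O_NONBLOCK) == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    int result = countFalseWritable(fds[0]);
    close(fds[0]);
    close(fds[1]);
    return result;
}
```

Calling runWritableCheck() from a main() and treating any non-zero result as a failure would give a small pass/fail check; an implementation showing the behaviour described in problem 1 would presumably hit the maxFalsePositives cap instead of returning zero.
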
I have one that demonstrates the first problem, which I shall attach here. I'll persist with making test cases for the other problems (I need to strip out irrelevant stuff from the app) and shall post them when I can reproduce the problems easily.

The attached source files are for a pair of programs, a client and a server. The server accepts connections on port 8888 and copies any data it receives back to the same socket. The client connects to that port on INADDR_LOOPBACK. It takes two file names as command line args, reads the first file, sends it to the socket, then writes whatever comes back from the socket to the file given as the second arg. Doing a diff on the two files after both programs complete is a test that everything worked. The bigger the file, the more stringent the test; I've been testing with files in the tens to hundreds of megabytes range. Both programs produce copious output on stdout to tell you what they are doing.

When run on Linux, the programs run very quickly, with no observed problems at all. On cygwin 1.3.14, on Windows 2000, you can see that the server side in particular spins through select, reporting EWOULDBLOCK all the time when selected for write. If you pause (e.g. ctrl-S) the client, you can see it even more clearly. The server should (and on Linux does) itself pause in that situation, waiting to be able to write to the socket. On cygwin it instead keeps going, constantly raising select conditions and constantly finding that it would block on the write, doing a busy-wait. A single-processor box illustrates the problem best, as with two processors the busy-wait doesn't look as bad.

I'll endeavour to provide more details and examples; I thought this much was worth contributing so far, as it does demonstrate one of the problems quite clearly. May I suggest it'd be worth adding a test based on this to the regression test suite?
Or, forgive my ignorance, making a regression test suite if one doesn't exist, and basing one of the tests on this. In either case, you are welcome to use the supplied code to do so.

Tim

-- 
-----------------------------------------------
Tim Allen          tim AT proximity DOT com DOT au
Proximity Pty Ltd  http://www.proximity.com.au/
http://www4.tpg.com.au/users/rita_tim/

--------------Boundary-00=_WTYBPR2B1G9EIHSAGJ1B
Content-Type: text/x-c++src; charset="us-ascii"; name="plainTCPEchoClient.C"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="plainTCPEchoClient.C"

// $Id: plainTCPEchoClient.C,v 1.3 2003/03/25 08:15:02 tim Exp $

// NOTE: the header names did not survive in the archived copy. The list
// below is an assumption: it is simply what the code in this file needs.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <unistd.h>
#include <fcntl.h>
#include <alloca.h>
#include <sys/types.h>
#include <sys/time.h>
#include <sys/socket.h>
#include <netinet/in.h>

/**
   Echo client for testing non-blocking TCP. Streams a file to a tcp port
   and writes out to another file what comes back.
 */

#define BUFF_SIZE 0x8000

int main(int argc, char* argv[])
{
    if (argc < 3) {
        // Argument placeholders are reconstructed from the description in
        // the mail (input file, then output file); the originals were lost.
        printf("Usage:%s <input-file> <output-file>\n", argv[0]);
        exit(1);
    }

    int sock;
    sock = socket(PF_INET, SOCK_STREAM, 0);
    if (sock == -1) {
        printf("Socket creation failed:%s\n", strerror(errno));
        exit(1);
    }
    fcntl(sock, F_SETFL, fcntl(sock, F_GETFL) | O_NONBLOCK);

    sockaddr_in sin;
    sin.sin_family = AF_INET;
    sin.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    sin.sin_port = htons(short(8888));
    memset(sin.sin_zero, 0, sizeof(sin.sin_zero));

    timeval patience;
    patience.tv_sec = 180;
    patience.tv_usec = 0;

    if (connect(sock, (sockaddr*)&sin, sizeof(sin)) == 0) {
        // connected immediately, no need to select
    } else {
        switch (errno) {
        case EINPROGRESS: {
            // select for WRITE
            fd_set writeSet;
            FD_ZERO(&writeSet);
            FD_SET(sock, &writeSet);
            if (select(sock + 1, NULL, &writeSet, NULL, &patience) < 1) {
                printf("connection select failed:%s\n", strerror(errno));
                exit(1);
            }
            if (!FD_ISSET(sock, &writeSet)) {
                printf("we got a select condition, but not for the socket, so wtf...\n");
                exit(1);
            }
            break;
        }
        default:
            printf("Connect failed:%s\n", strerror(errno));
            exit(1);
        }
    }

    // Check whether the (possibly asynchronous) connect actually succeeded
    int sockError = 0;
    socklen_t sockErrorLen = sizeof(sockError);
    if (getsockopt(sock, SOL_SOCKET, SO_ERROR, &sockError, &sockErrorLen) == -1) {
        printf("connection failed:%s\n", strerror(errno));
        exit(1);
    } else if (sockError != 0) {
        printf("socket error:%s\n", strerror(sockError));
        exit(1);
    }
    printf("Connected, apparently\n");

    int inFD = open(argv[1], O_RDONLY);
    if (inFD < 0) {   // open() returns -1 on failure; 0 is a valid descriptor
        printf("Couldn't open input file %s:%s\n", argv[1], strerror(errno));
        exit(1);
    }
    int outFD = open(argv[2], O_WRONLY | O_CREAT | O_TRUNC, 0666);
    if (outFD < 0) {
        printf("Couldn't open output file %s:%s\n", argv[2], strerror(errno));
        exit(1);
    }

    char* inBuff = (char *)alloca(BUFF_SIZE);
    char* outBuff = (char *)alloca(BUFF_SIZE);
    int inSize = 0, outSize = 0;
    bool selectSockIn, selectSockOut;
    bool fileEOF = false, sockEOF = false;
    bool shutdownWriteSock = false, done = false;
    int ret;

    selectSockIn = true;
    selectSockOut = true;

    int maxFD = sock;
    if (inFD > maxFD) maxFD = inFD;
    if (outFD > maxFD) maxFD = outFD;
    maxFD++;

    int counter = 0;
    while (!done) {
        fd_set writeSet, readSet;
        printf("looping %d %d %d\n", counter++, selectSockOut, selectSockIn);
        FD_ZERO(&writeSet);
        FD_ZERO(&readSet);
        if (selectSockIn)
            FD_SET(sock, &readSet);
        if (selectSockOut)
            FD_SET(sock, &writeSet);
        ret = select(maxFD, &readSet, &writeSet, NULL, NULL);
        if (ret < 0) {
            printf("select returned error:%s\n", strerror(errno));
            exit(1);
        }

        // Refill the send buffer from the input file once it has drained
        if (!fileEOF && inSize == 0) {
            inSize = read(inFD, inBuff, BUFF_SIZE);
            printf("read %d bytes from file\n", inSize);
            if (inSize == 0) {
                fileEOF = true;
                printf("file read at eof, closing\n");
                close(inFD);
            } else if (inSize < 0) {
                printf("file read error:%s\n", strerror(errno));
                exit(1);
            }
        }

        if (inSize > 0) {
            ret = write(sock, inBuff, inSize);
            printf("wrote %d bytes to socket\n", ret);
            if (ret < 0) {
                // errno is only meaningful when write() itself failed
                if (errno == EWOULDBLOCK) {
                    printf("EWOULDBLOCK on writing\n");
                } else {
                    printf("socket write error:%s\n", strerror(errno));
                    exit(1);
                }
            } else if (ret < inSize) {
                // Partial write: keep only the unsent tail so that no data
                // is sent twice on the next pass
                memmove(inBuff, inBuff + ret, inSize - ret);
                inSize -= ret;
            } else {
                inSize = 0;
            }
        }

        if (fileEOF && inSize == 0 && !shutdownWriteSock) {
            shutdown(sock, SHUT_WR);
            printf("shutting down socket for writing\n");
            shutdownWriteSock = true;
        }
        selectSockOut = (inSize > 0 || (fileEOF && !shutdownWriteSock));

        if (FD_ISSET(sock, &readSet)) {
            outSize = read(sock, outBuff, BUFF_SIZE);
            printf("read %d bytes from socket\n", outSize);
            if (outSize == 0) {
                sockEOF = true;
                shutdown(sock, SHUT_RD);
                printf("shutting down socket for reading\n");
            } else if (outSize < 0) {
                if (errno != EWOULDBLOCK) {
                    printf("socket read error:%s\n", strerror(errno));
                    exit(1);
                }
                // Nothing was read; reset so we keep selecting for read
                outSize = 0;
            }
        }

        if (outSize > 0) {
            ret = write(outFD, outBuff, outSize);
            printf("wrote %d bytes to file\n", ret);
            if (ret < outSize) {
                printf("file write error:%s\n", strerror(errno));
                exit(1);
            }
            outSize = 0;
        }

        if (sockEOF) {
            close(outFD);
            printf("closing output file, done\n");
            done = true;
        }
        selectSockIn = (!sockEOF && outSize == 0);
    }
}

--------------Boundary-00=_WTYBPR2B1G9EIHSAGJ1B
Content-Type: text/x-c++src; charset="us-ascii"; name="plainTCPEchoServer.C"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="plainTCPEchoServer.C"

// $Id: plainTCPEchoServer.C,v 1.3 2003/03/25 08:53:33 tim Exp $

// NOTE: the header names did not survive in the archived copy. The list
// below is an assumption: it is simply what the code in this file needs.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <unistd.h>
#include <fcntl.h>
#include <alloca.h>
#include <sys/types.h>
#include <sys/time.h>
#include <sys/socket.h>
#include <netinet/in.h>

/**
   Echo server for testing non-blocking TCP. Accepts connections on a port,
   and streams back whatever data it receives.
 */

#define BUFF_SIZE 0x8000

int main(int argc, char* argv[])
{
    int sock;
    sock = socket(PF_INET, SOCK_STREAM, 0);
    if (sock == -1) {
        printf("Socket creation failed:%s\n", strerror(errno));
        exit(1);
    }

    int optval = 1;
    if (setsockopt(sock, SOL_SOCKET, SO_REUSEADDR, &optval, sizeof(optval)) == -1) {
        printf("setsockopt:%s\n", strerror(errno));
        exit(1);
    }

    sockaddr_in sin;
    sin.sin_family = AF_INET;
    sin.sin_addr.s_addr = htonl(INADDR_ANY);
    sin.sin_port = htons(short(8888));
    memset(sin.sin_zero, 0, sizeof(sin.sin_zero));

    if (bind(sock, (sockaddr*)&sin, sizeof(sin)) == -1) {
        printf("bind:%s\n", strerror(errno));
        exit(1);
    }
    printf("bound\n");

    if (listen(sock, 5) == -1) {
        printf("listen:%s\n", strerror(errno));
        exit(1);
    }
    fcntl(sock, F_SETFL, fcntl(sock, F_GETFL) | O_NONBLOCK);

    // select for read
    fd_set readSet;
    FD_ZERO(&readSet);
    FD_SET(sock, &readSet);
    if (select(sock + 1, &readSet, NULL, NULL, NULL) < 1) {
        printf("passive connection select failed:%s\n", strerror(errno));
        exit(1);
    }
    if (!FD_ISSET(sock, &readSet)) {
        printf("we got a select condition, but not for the socket, so wtf...\n");
        exit(1);
    }
    printf("listened\n");

    int newfd = accept(sock, NULL, NULL);
    printf("accepting\n");
    while (newfd == -1) {
        switch (errno) {
        case EWOULDBLOCK:
            printf("selecting for accept\n");
            // select() modifies the set, so rebuild it before waiting again
            FD_ZERO(&readSet);
            FD_SET(sock, &readSet);
            if (select(sock + 1, &readSet, NULL, NULL, NULL) < 1) {
                printf("passive connection select failed:%s\n", strerror(errno));
                exit(1);
            }
            break;
        default:
            printf("accept failed:%s\n", strerror(errno));
            exit(1);
        }
        // Retry the accept; without this the loop could never terminate
        newfd = accept(sock, NULL, NULL);
    }
    // Take the flags from the new socket, not from the listening one
    fcntl(newfd, F_SETFL, fcntl(newfd, F_GETFL) | O_NONBLOCK);
    printf("accepted\n");

    char* inBuff = (char *)alloca(BUFF_SIZE);
    int inSize = 0;
    bool selectSockIn, selectSockOut;
    bool sockEOF = false;
    bool done = false;
    int ret;

    selectSockIn = true;
    selectSockOut = true;

    int maxFD = newfd;
    if (sock > maxFD) maxFD = sock;
    maxFD++;

    int counter = 0;
    while (!done) {
        printf("looping %d %d %d\n", counter++, selectSockIn, selectSockOut);
        fd_set writeSet, readSet;
        FD_ZERO(&writeSet);
        FD_ZERO(&readSet);
        if (selectSockIn)
            FD_SET(newfd, &readSet);
        if (selectSockOut)
            FD_SET(newfd, &writeSet);
        ret = select(maxFD, &readSet, &writeSet, NULL, NULL);
        if (ret < 0) {
            printf("select returned error:%s\n", strerror(errno));
            exit(1);
        }

        if (FD_ISSET(newfd, &readSet)) {
            inSize = read(newfd, inBuff, BUFF_SIZE);
            printf("read %d bytes from socket\n", inSize);
            if (inSize == 0) {
                sockEOF = true;
                printf("sock read at eof\n");
            } else if (inSize < 0) {
                printf("socket read error:%s\n", strerror(errno));
                exit(1);
            }
        }

        if (inSize > 0) {
            ret = write(newfd, inBuff, inSize);
            printf("wrote %d bytes to socket\n", ret);
            if (ret < 0) {
                // errno is only meaningful when write() itself failed
                if (errno != EWOULDBLOCK) {
                    printf("socket write error:%s\n", strerror(errno));
                    exit(1);
                } else {
                    printf("EWOULDBLOCK on writing\n");
                }
            } else if (ret < inSize) {
                // Partial write: keep only the unsent tail so that no data
                // is echoed twice on the next pass
                memmove(inBuff, inBuff + ret, inSize - ret);
                inSize -= ret;
            } else {
                inSize = 0;
            }
        }

        if (sockEOF && inSize == 0) {
            close(newfd);
            done = true;
        }
        selectSockOut = (inSize > 0 || sockEOF);
        selectSockIn = (!sockEOF && inSize == 0);
    }
}

--------------Boundary-00=_WTYBPR2B1G9EIHSAGJ1B--