X-Recipient: archive-cygwin AT delorie DOT com
X-Spam-Check-By: sourceware.org
Date: Sun, 16 Dec 2007 14:42:21 +0100
From: Corinna Vinschen
To: cygwin AT cygwin DOT com
Subject: Re: VM and non-blocking writes
Message-ID: <20071216134221.GD18860@calimero.vinschen.de>
Reply-To: cygwin AT cygwin DOT com
Mail-Followup-To: cygwin AT cygwin DOT com
References: <47616D31 DOT 7090002 AT 4raccoons DOT com> <20071213175934 DOT GB25863 AT calimero DOT vinschen DOT de> <476185AF DOT 5000906 AT 4raccoons DOT com> <20071214111508 DOT GD25863 AT calimero DOT vinschen DOT de> <20071214143230 DOT GK25863 AT calimero DOT vinschen DOT de>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="xHFwDpU9dbj6ez1V"
Content-Disposition: inline
User-Agent: Mutt/1.5.16 (2007-06-09)
Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm
Precedence: bulk
Sender: cygwin-owner AT cygwin DOT com
Delivered-To: mailing list cygwin AT cygwin DOT com

--xHFwDpU9dbj6ez1V
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Dec 15 12:29, Robert Pendell wrote:
> Corinna Vinschen wrote:
> > Obviously I searched wrong.  There are reports about this behaviour
> > since at least 1998 and it has never been fixed.  These two links
> > might be interesting:
> >
> > http://support.microsoft.com/kb/q201213/
> > http://tinyurl.com/2brokp
>
> Do you have the test case you used for the pure win32 mode?

Sure, but before we start with this, a note:

I'm contemplating working around this problem in Cygwin (not for
1.5.25, but in the main trunk) by capping the number of bytes in a
single send call, according to the patch Lev sent in
http://www.cygwin.com/ml/cygwin-patches/2006-q2/msg00031.html.

Lev, are you interested in reworking your patch (minus the pipe stuff)
to match current CVS?
Is there any gain in raising SO_SNDBUF/SO_RCVBUF to a value > 8K,
especially in the light of my experiences commented on in net.cc,
function fdsock()?

Back to the testcase.  Source attached.  I created it so that it can be
built as a Cygwin or Linux executable

  $ gcc -g -o nbcheck nbcheck.c

as well as a native Windows application using mingw:

  $ gcc -g -mno-cygwin -o nbcheck-nat nbcheck.c -lws2_32

It takes the size of the user data buffer as an optional argument,
defaulting to 100,000,000 bytes.

> If you do
> then maybe I can try and push to get this fixed for the next service
> pack release for both XP and Vista as well as Server 2008.  This will
> especially be the case if it can be easily reproduced.

Reproducing the issue is as easy as Wayne described.  Just start a
client application which connects but never reads, for instance by
using the python sequence Wayne used in his mail:

  $ python
  import socket
  s = socket.socket()
  s.connect(("name-of-windows-box", 12345))

If you add a second arbitrary argument, the testcase always tries to
write in chunks of 10,000 bytes.  This shows how select starts to block
at some point, in my case on XP SP2 after writing 190,000 bytes.
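
For illustration only, the capping workaround mentioned earlier (limiting
the number of bytes handed to a single send call) could look roughly like
the sketch below.  This is not Lev's patch and not code from Cygwin.  The
names SEND_CAP, capped_len and send_capped, as well as the 64K value, are
made up for this example, and it assumes a POSIX-style send().

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical cap.  64K is an arbitrary illustration value, not a
   number taken from Lev's patch or from Cygwin. */
#define SEND_CAP ((size_t) 65536)

/* Number of bytes to hand to one send() call: the remaining length,
   but never more than cap. */
static size_t
capped_len (size_t remaining, size_t cap)
{
  assert (cap > 0);
  return remaining < cap ? remaining : cap;
}

/* Pass at most SEND_CAP bytes from buf to send().  The caller is
   expected to loop (for instance around select) until all data is
   written.  Returns whatever send() returns. */
static ssize_t
send_capped (int fd, const char *buf, size_t remaining)
{
  return send (fd, buf, capped_len (remaining, SEND_CAP), 0);
}
```

With a cap like this, a nonblocking writer never offers the whole
100,000,000 byte buffer to the stack in one call, whatever the
application asks for.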

Result on Linux:

  $ ./nbcheck 500000000
  listening to port 12345 host linux-box (10.0.0.1)
  got connection from 10.0.0.3
  accepted socket is nonblocking now
  buffer size is 100000000 bytes
  trying to write 100000000 bytes
  65536 bytes written
  trying to write 99934464 bytes
  147456 bytes written
  [HANG in select]

  $ ./nbcheck 100000000
  listening to port 12345 host linux-box (10.0.0.1)
  got connection from 10.0.0.3
  accepted socket is nonblocking now
  buffer size is 100000000 bytes
  trying to write 100000000 bytes
  65536 bytes written
  trying to write 99934464 bytes
  147456 bytes written
  [HANG in select]

  $ ./nbcheck 100000000 x
  listening to port 12345 host linux-box (10.0.0.1)
  got connection from 10.0.0.3
  accepted socket is nonblocking now
  buffer size is 100000000 bytes
  trying to write 10000 bytes
  10000 bytes written
  trying to write 10000 bytes
  10000 bytes written
  trying to write 10000 bytes
  10000 bytes written
  trying to write 10000 bytes
  10000 bytes written
  trying to write 10000 bytes
  10000 bytes written
  trying to write 10000 bytes
  10000 bytes written
  trying to write 10000 bytes
  10000 bytes written
  trying to write 10000 bytes
  10000 bytes written
  trying to write 10000 bytes
  10000 bytes written
  trying to write 10000 bytes
  10000 bytes written
  trying to write 10000 bytes
  10000 bytes written
  trying to write 10000 bytes
  10000 bytes written
  trying to write 10000 bytes
  10000 bytes written
  trying to write 10000 bytes
  10000 bytes written
  trying to write 10000 bytes
  10000 bytes written
  trying to write 10000 bytes
  10000 bytes written
  trying to write 10000 bytes
  10000 bytes written
  [HANG in select]

Result on Windows:

  $ ./nbcheck-nat 500000000
  listening to port 12345 host windows-box (10.0.0.2)
  got connection from 10.0.0.3
  accepted socket is nonblocking now
  buffer size is 500000000 bytes
  trying to write 500000000 bytes
  Err: 10055
  hit return to exit

  $ ./nbcheck-nat 100000000
  listening to port 12345 host windows-box (10.0.0.2)
  got connection from 10.0.0.3
  accepted socket is nonblocking now
  buffer size is 100000000 bytes
  trying to write 100000000 bytes
  100000000 bytes written
  hit return to exit

  $ ./nbcheck-nat 100000000 x
  listening to port 12345 host windows-box (10.0.0.2)
  got connection from 10.0.0.3
  accepted socket is nonblocking now
  buffer size is 100000000 bytes
  trying to write 10000 bytes
  10000 bytes written
  trying to write 10000 bytes
  10000 bytes written
  trying to write 10000 bytes
  10000 bytes written
  trying to write 10000 bytes
  10000 bytes written
  trying to write 10000 bytes
  10000 bytes written
  trying to write 10000 bytes
  10000 bytes written
  trying to write 10000 bytes
  10000 bytes written
  [WAIT in select for 5 seconds]
  trying to write 10000 bytes
  10000 bytes written
  [WAIT in select for 14 seconds]
  trying to write 10000 bytes
  10000 bytes written
  [WAIT in select for about 60 seconds]
  trying to write 10000 bytes
  10000 bytes written
  [WAIT in select for about 60 seconds]
  [a couple of times, but not always the same]
  trying to write 10000 bytes
  10000 bytes written
  [HANG in select]

The hang occurred in one test run after 160,000 bytes, in another after
190,000 bytes.  I have no idea if there's some sort of rule behind that.

> A source and
> binary version will be useful for this.

Creating a binary is easy, see above.

> I am in the tech beta group for
> Vista SP1, XP SP3, and Server 2008 so I can at least remind them of this
> bug and show them a test case.  No guarantees that it will be fixed.

Actually, given that this behaviour has been known for at least 10
years, I doubt that it will even be accepted as a bug.  But you should
never give up hope, right?
:)


Thanks for your offer,
Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat

--xHFwDpU9dbj6ez1V
Content-Type: text/x-c++src; charset=us-ascii
Content-Disposition: attachment; filename="nbcheck.c"

/* The archive stripped the header names from the #include lines; the
   set below is a reconstruction of what the code needs. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <assert.h>
#ifdef _WIN32
#include <winsock2.h>
#include <windows.h>
WSADATA wsadata;
#define SOCKLEN_T int
#else // Assume Unix-like system
#include <unistd.h>
#include <errno.h>
#include <fcntl.h>
#include <netdb.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/select.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#define SOCKET int
#define WSADATA int
#define WSAStartup(a,b)
#define SOCKET_ERROR -1
#define SOCKLEN_T socklen_t
#define WSAGetLastError() (errno)
#define SD_BOTH SHUT_RDWR
#define closesocket close
#define WSACleanup()
#endif

int main(int argc, char **argv)
{
  int i;
  SOCKET fd, fd2;
  struct hostent *hp;
  struct protoent *pp;
  char hostname[64];
  struct sockaddr_in lAddr, rAddr;
  char* data;
  size_t datalen, datapos;

  WSAStartup (MAKEWORD(2,2), &wsadata);

  gethostname(hostname, 64);
  pp = getprotobyname("tcp");
  hp = gethostbyname(hostname);
  setbuf (stdout, NULL);
  assert(pp && hp);

  /* Listen on port 12345 on the local host's address. */
  fd = socket(AF_INET, SOCK_STREAM, pp->p_proto);
  assert(fd != SOCKET_ERROR);
  memset(&lAddr, 0, sizeof(lAddr));
  lAddr.sin_family = hp->h_addrtype;
  memcpy(&lAddr.sin_addr.s_addr, hp->h_addr, sizeof(lAddr.sin_addr.s_addr));
  lAddr.sin_port = htons(12345);
  i = bind(fd, (struct sockaddr *)&lAddr, sizeof(lAddr));
  assert(i != SOCKET_ERROR);
  printf("listening to port %d host %s (%s)\n",
         ntohs(lAddr.sin_port), hostname, inet_ntoa(lAddr.sin_addr));
  i = listen(fd, 5);
  assert(i != SOCKET_ERROR);

  i = sizeof(rAddr);
  memset(&rAddr, 0, sizeof(rAddr));
  fd2 = accept(fd, (struct sockaddr *)&rAddr, (SOCKLEN_T *) &i);
  assert(fd2 != SOCKET_ERROR);
  printf("got connection from %s\n", inet_ntoa(rAddr.sin_addr));

  /* Switch the accepted socket to nonblocking mode. */
#ifdef _WIN32
  {
    u_long on = 1;
    i = ioctlsocket (fd2, FIONBIO, &on);
  }
#else
  i = fcntl(fd2, F_SETFL, O_NONBLOCK);
#endif
  assert(i != SOCKET_ERROR);
  printf("accepted socket is nonblocking now\n");

  datalen = argc > 1 ? strtol (argv[1], NULL, 0) : 100000000;
  data = (char *) malloc(datalen);
  assert(data);
  printf("buffer size is %lu bytes\n", (unsigned long) datalen);

  datapos = 0;
  while (datapos < datalen)
    {
      fd_set wfds;
      /* With a second argument, write in chunks of at most 10000 bytes,
         but never past the end of the buffer. */
      size_t len = datalen - datapos;
      if (argc > 2 && len > 10000)
        len = 10000;

      FD_ZERO(&wfds);
      FD_SET(fd2, &wfds);
      i = select(fd2 + 1, NULL, &wfds, NULL, NULL);
      assert(i == 1);
      printf("trying to write %d bytes\n", (int) len);
#if 0 // Same effect as send() on Windows, not available on Unix
      {
        DWORD ret;
        WSABUF iov[1];
        iov[0].buf = data + datapos;
        iov[0].len = len;
        i = WSASendTo (fd2, iov, 1, &ret, 0, NULL, 0, NULL, NULL);
        if (i != SOCKET_ERROR)
          i = ret;
      }
#else
      i = send (fd2, data + datapos, len, 0);
#endif
      if (i == SOCKET_ERROR)
        {
          printf ("Err: %d\n", WSAGetLastError ());
          break;
        }
      else
        printf("%d bytes written\n", i);
      datapos += i;
      assert(datapos <= datalen);
    }

  shutdown (fd2, SD_BOTH);
  closesocket (fd2);
  printf("hit return to exit ");
  getchar();
  WSACleanup ();
  return 0;
}

--xHFwDpU9dbj6ez1V
Content-Type: text/plain; charset=us-ascii

--
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
Problem reports:       http://cygwin.com/problems.html
Documentation:         http://cygwin.com/docs.html
FAQ:                   http://cygwin.com/faq/
--xHFwDpU9dbj6ez1V--