Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm
List-Subscribe:
List-Archive:
List-Post:
List-Help: ,
Sender: cygwin-owner AT cygwin DOT com
Mail-Followup-To: cygwin AT cygwin DOT com
Delivered-To: mailing list cygwin AT cygwin DOT com
Message-ID: <3EA3F7EC.833E28B@precidia.com>
Date: Mon, 21 Apr 2003 09:53:48 -0400
From: Brian White
Organization: Precidia Technologies http://www.precidia.com/
X-Accept-Language: en
MIME-Version: 1.0
To: cygwin AT cygwin DOT com
Subject: bug: tcp RST instead of FIN if child exits after parent closes path
Content-Type: multipart/mixed; boundary="------------C1058CCDBF3398078A25FA9D"
Note-from-DJ: This may be spam

--------------C1058CCDBF3398078A25FA9D
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

There have been suggestions that the source of this problem is the
Windows socket interface, but that doesn't seem to fit with the
descriptions other people have provided. My original report is attached.

Max Bowsher believes it has to do with reference counting, but I don't
understand what he means by that. There must already be some kind of
reference counting, because the socket does not close until the last
process with an open path to the socket either exits or explicitly
closes that path.

The problem is not dependent upon data pending in the TCP send window:
the example program I created sends no data and yet still demonstrates
the problem.

Clewis AT mobilecom DOT com suggested the problem might be that the last
process to close the socket is not the same process that opened it.

I don't mean to harp on this issue; it's just that it is a fairly
significant problem that does not seem to have been adequately explained.
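
For readers unsure what "reference counting" means here, the following is a
minimal sketch of the POSIX behaviour described above, not anything taken
from Cygwin's sources. It uses a local AF_UNIX socket pair instead of TCP so
that it needs no network, and the names peer_closed and last_holder_demo are
hypothetical helpers invented for this illustration. The parent closes its
copy of one end while a forked child still holds a copy; the other end only
sees end-of-file once the child has exited too.

```c
#include <assert.h>
#include <poll.h>
#include <stdio.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* 1 if the peer has closed (a read would hit EOF), 0 if nothing
 * arrives within timeout_ms, -1 on error. */
static int peer_closed(int fd, int timeout_ms)
{
    struct pollfd p = { .fd = fd, .events = POLLIN };
    char c;
    ssize_t r;
    int n = poll(&p, 1, timeout_ms);

    if (n <= 0)
        return n;                       /* 0: timed out, -1: error */
    r = recv(fd, &c, 1, MSG_PEEK);
    return r < 0 ? -1 : r == 0;
}

/* Hypothetical helper: stores whether the far end saw EOF before
 * (*early) and after (*late) the child exited. Returns 0 or -1. */
int last_holder_demo(int *early, int *late)
{
    int sv[2], ctl[2];
    char c;
    pid_t pid;

    assert(early != NULL && late != NULL);
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;
    if (pipe(ctl) == -1)
        return -1;
    pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        /* child: keep sv[0] open until the parent closes ctl[1] */
        close(sv[1]);
        close(ctl[1]);
        while (read(ctl[0], &c, 1) > 0)
            ;
        _exit(0);
    }
    close(ctl[0]);
    close(sv[0]);                       /* parent drops its reference */
    *early = peer_closed(sv[1], 200);   /* child still holds sv[0] */
    close(ctl[1]);                      /* let the child exit */
    if (waitpid(pid, NULL, 0) == -1)
        return -1;
    *late = peer_closed(sv[1], 2000);   /* last reference is gone */
    close(sv[1]);
    return (*early < 0 || *late < 0) ? -1 : 0;
}

#ifdef DEMO_MAIN
int main(void)
{
    int early, late;

    if (last_holder_demo(&early, &late) == -1) {
        perror("last_holder_demo");
        return 1;
    }
    printf("EOF seen while child held the socket: %d\n", early);
    printf("EOF seen after child exited:          %d\n", late);
    return 0;
}
#endif
```

Compile with -DDEMO_MAIN to run it standalone. The control pipe is only there
to make the ordering deterministic; it plays the role of "sleep 1" in the
attached test program.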

                                          Brian
                                 ( bcwhite AT precidia DOT com )

-------------------------------------------------------------------------------
the difference between theory and practice is less in theory than in practice

--------------C1058CCDBF3398078A25FA9D
Content-Type: message/rfc822
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

Return-path:
Received: from sources.redhat.com [66.187.233.205] by jordan.precidia.com with smtp (Exim 3.35 #1 (Debian)) id 195Tal-0000tD-00; Tue, 15 Apr 2003 12:47:31 -0400
Received: (qmail 17682 invoked by alias); 15 Apr 2003 16:46:04 -0000
Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm
Precedence: bulk
List-Unsubscribe:
List-Subscribe:
List-Archive:
List-Post:
List-Help: ,
Sender: cygwin-owner AT cygwin DOT com
Mail-Followup-To: cygwin AT cygwin DOT com
Delivered-To: mailing list cygwin AT cygwin DOT com
Received: (qmail 17666 invoked from network); 15 Apr 2003 16:46:03 -0000
Received: from unknown (HELO mx2.magma.ca) (206.191.0.250) by sources.redhat.com with SMTP; 15 Apr 2003 16:46:03 -0000
Received: from mail4.magma.ca (mail4.magma.ca [206.191.0.222]) by mx2.magma.ca (Magma Relay Server) with ESMTP id h3FGk4mD021393 for ; Tue, 15 Apr 2003 12:46:04 -0400
Received: from ottgate.precidia.com (ottgate.precidia.com [206.191.32.162]) by mail4.magma.ca (Magma's Mail Server) with ESMTP id h3FGk3v7012751 for ; Tue, 15 Apr 2003 12:46:03 -0400 (EDT)
Received: from tolkien.ott.precidia.com [10.0.1.2] (mail) by ottgate.precidia.com with esmtp (Exim 3.35 #1 (Debian)) id 195TZL-0000KN-00; Tue, 15 Apr 2003 12:46:03 -0400
Received: from adams.ott.precidia.com (precidia.com) [10.0.2.138] by tolkien.ott.precidia.com with esmtp (Exim 3.35 #1 (Debian)) id 195TZQ-0003Bj-00; Tue, 15 Apr 2003 12:46:08 -0400
Message-ID: <3E9C374E DOT 4256358D AT precidia DOT com>
Date: Tue, 15 Apr 2003 12:46:06 -0400
From: Brian White
Organization: Precidia Technologies http://www.precidia.com/
X-Accept-Language: en
MIME-Version: 1.0
To: cygwin AT cygwin DOT com
Subject: bug: tcp RST instead of FIN if child exists after parent closes path
Content-Type: multipart/mixed; boundary="------------A10D7981ECC01483E6D26E19"
X-MailScanner: Found to be clean
X-MailScanner-SpamCheck: not spam, SpamAssassin (score=-99.5, required 8, NOSPAM_INC, SPAM_PHRASE_00_01, UNSUB_PAGE, USER_IN_WHITELIST, X_ACCEPT_LANG)
X-Mozilla-Status2: 00000000

--------------A10D7981ECC01483E6D26E19
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

I came across this while trying to get amanda to compile for Cygwin, and
wrote a small test program (attached) to demonstrate the problem.

If a process opens a socket and then passes that path to a child process,
the path is now open in both processes. As long as the child program exits
_before_ the parent closes the socket, everything is fine:

    root AT watertown ~/build
    $ ./tcptest --before griffon 9
    connection established
    child process has exited and closed socket path
    closing main program's socket path
    test complete

Here's the output from tcpdump of the session. Everything is correct.
Notice that the FINs don't occur until 1 second after the initial handshake
is completed, that is, after the child's "sleep 1" has finished.

12:18:06.014140 watertown.ott.precidia.com.3669 > griffon.ott.precidia.com.discard: S 2389465615:2389465615(0) win 16384 (DF)
12:18:06.014247 griffon.ott.precidia.com.discard > watertown.ott.precidia.com.3669: S 690327584:690327584(0) ack 2389465616 win 5840 (DF)
12:18:06.014654 griffon.ott.precidia.com.discard > watertown.ott.precidia.com.3669: S 690327584:690327584(0) ack 2389465616 win 5840 (DF)
12:18:06.014660 watertown.ott.precidia.com.3669 > griffon.ott.precidia.com.discard: . ack 1 win 17520 (DF)
12:18:07.141048 watertown.ott.precidia.com.3669 > griffon.ott.precidia.com.discard: F 1:1(0) ack 1 win 17520 (DF)
12:18:07.141676 griffon.ott.precidia.com.discard > watertown.ott.precidia.com.3669: F 1:1(0) ack 2 win 5840 (DF)
12:18:07.141976 griffon.ott.precidia.com.discard > watertown.ott.precidia.com.3669: F 1:1(0) ack 2 win 5840 (DF)
12:18:07.142235 watertown.ott.precidia.com.3669 > griffon.ott.precidia.com.discard: . ack 2 win 17520 (DF)

Next I run the test program, telling it to have the child exit _after_ the
main program has closed the socket.

    root AT watertown ~/build
    $ ./tcptest --after griffon 9
    connection established
    closing main program's socket path
    child process has exited and closed socket path
    test complete

This time, when the parent closes the socket path with the child still
running, the connection is aborted with a RST instead of a FIN. This is
not good. Note that the reset still comes 1 second after the initial
handshake, showing that it only happens at the point when the child exits.

12:18:14.554661 watertown.ott.precidia.com.3768 > griffon.ott.precidia.com.discard: S 2396409222:2396409222(0) win 16384 (DF)
12:18:14.554774 griffon.ott.precidia.com.discard > watertown.ott.precidia.com.3768: S 693255112:693255112(0) ack 2396409223 win 5840 (DF)
12:18:14.555174 griffon.ott.precidia.com.discard > watertown.ott.precidia.com.3768: S 693255112:693255112(0) ack 2396409223 win 5840 (DF)
12:18:14.555180 watertown.ott.precidia.com.3768 > griffon.ott.precidia.com.discard: . ack 1 win 17520 (DF)
12:18:15.646499 watertown.ott.precidia.com.3768 > griffon.ott.precidia.com.discard: R 2396409223:2396409223(0) win 0 (DF)

Running the same program under Linux shows that in both cases the session
closes with a FIN.

                                          Brian
                                 ( bcwhite AT precidia DOT com )

-------------------------------------------------------------------------------
    Seize the moment!  Live now.  Make "now" always the most important time.
    -- JLP

--------------A10D7981ECC01483E6D26E19
Content-Type: text/plain; charset=us-ascii; name="tcptest.c"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline; filename="tcptest.c"

/* The header names were stripped by the list archive; the ones listed
 * here are simply those the code below requires. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <netdb.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>

int main(int argc, char* argv[])
{
    struct sockaddr_in addr;
    struct hostent*    host;
    int                close_first;  /* parent closes before child exits */
    int                sock;
    pid_t              pid;
    int                status;

    if (argc != 4 || (strcmp(argv[1],"--before") != 0 && strcmp(argv[1],"--after") != 0)) {
        fprintf(stderr,"Use: %s <--before|--after> <host> <port>\n",argv[0]);
        exit(1);
    }

    /* "--before": child exits before the parent closes the socket.
     * "--after":  child exits after the parent has closed it. */
    close_first = (strcmp(argv[1],"--after") == 0);

    sock = socket(AF_INET,SOCK_STREAM,0);
    if (sock == -1) {
        perror("could not create socket");
        exit(1);
    }

    host = gethostbyname(argv[2]);
    if (host == NULL) {
        herror("could not resolve hostname");
        exit(1);
    }

    memset(&addr,0,sizeof(addr));
    addr.sin_family = host->h_addrtype;
    memcpy(&addr.sin_addr,host->h_addr,sizeof(addr.sin_addr));
    addr.sin_port = htons(atoi(argv[3]));

    if (connect(sock,(struct sockaddr*)&addr,sizeof(addr)) == -1) {
        perror("could not connect to host");
        exit(1);
    }
    fprintf(stderr,"connection established\n");

    /* The child inherits the socket and holds it open for one second. */
    pid = fork();
    switch (pid) {
    case -1: /* error */
        perror("could not fork");
        exit(1);
    case 0:  /* child */
        execlp("sleep","sleep","1",NULL);
        perror("could not execlp 'sleep 1'");
        exit(1);
    default: /* parent */
        break;
    }

    if (close_first) {
        fprintf(stderr,"closing main program's socket path\n");
        close(sock);
    }

    if (waitpid(pid,&status,0) == -1) {
        perror("could not wait for child");
        exit(1);
    }
    fprintf(stderr,"child process has exited and closed socket path\n");

    if (!close_first) {
        fprintf(stderr,"closing main program's socket path\n");
        close(sock);
    }

    fprintf(stderr,"test complete\n");
    return 0;
}

--------------A10D7981ECC01483E6D26E19
Content-Type: text/plain; charset=us-ascii

--
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
Bug reporting:         http://cygwin.com/bugs.html
Documentation:         http://cygwin.com/docs.html
FAQ:                   http://cygwin.com/faq/
--------------A10D7981ECC01483E6D26E19--

--------------C1058CCDBF3398078A25FA9D
Content-Type: text/plain; charset=us-ascii

--
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
Problem reports:       http://cygwin.com/problems.html
Documentation:         http://cygwin.com/docs.html
FAQ:                   http://cygwin.com/faq/
--------------C1058CCDBF3398078A25FA9D--
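
The report relies on tcpdump to tell a FIN from a RST. The same distinction
can be seen from the receiving application, and the following is a rough
sketch of that, assuming ordinary POSIX sockets: a read returning 0 means the
peer sent a FIN, while a read failing with ECONNRESET means the connection
was reset. The names close_kind, wait_for_close and observe_close are
hypothetical, and the loopback self-check forces a reset with SO_LINGER and a
zero timeout, which is only a stand-in for the Cygwin behaviour reported
above, not a reproduction of it.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

enum close_kind { CLOSE_ERROR = -1, CLOSE_FIN = 0, CLOSE_RST = 1 };

/* Block until the peer ends the connection and report how it ended. */
static enum close_kind wait_for_close(int conn)
{
    char buf[256];

    for (;;) {
        ssize_t n = read(conn, buf, sizeof buf);
        if (n > 0)
            continue;                   /* discard any payload */
        if (n == 0)
            return CLOSE_FIN;           /* orderly shutdown */
        if (errno == EINTR)
            continue;
        return errno == ECONNRESET ? CLOSE_RST : CLOSE_ERROR;
    }
}

/* Loopback self-check: connect to a local listener, close the client
 * normally (abortive == 0) or with a zero linger timeout (abortive == 1),
 * and classify what the accepting side sees. */
enum close_kind observe_close(int abortive)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    struct linger lg = { .l_onoff = 1, .l_linger = 0 };
    int lsn, cli = -1, conn = -1;
    enum close_kind kind = CLOSE_ERROR;

    assert(abortive == 0 || abortive == 1);
    lsn = socket(AF_INET, SOCK_STREAM, 0);
    if (lsn == -1)
        return CLOSE_ERROR;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;                  /* let the kernel pick a port */
    if (bind(lsn, (struct sockaddr *)&addr, sizeof addr) == -1 ||
        listen(lsn, 1) == -1 ||
        getsockname(lsn, (struct sockaddr *)&addr, &len) == -1)
        goto out;
    cli = socket(AF_INET, SOCK_STREAM, 0);
    if (cli == -1 ||
        connect(cli, (struct sockaddr *)&addr, sizeof addr) == -1)
        goto out;
    conn = accept(lsn, NULL, NULL);
    if (conn == -1)
        goto out;
    if (abortive &&
        setsockopt(cli, SOL_SOCKET, SO_LINGER, &lg, sizeof lg) == -1)
        goto out;
    close(cli);                         /* sends FIN, or RST if abortive */
    cli = -1;
    kind = wait_for_close(conn);
out:
    if (cli != -1)
        close(cli);
    if (conn != -1)
        close(conn);
    close(lsn);
    return kind;
}

#ifdef DEMO_MAIN
int main(void)
{
    printf("normal close:   %d\n", observe_close(0));
    printf("abortive close: %d\n", observe_close(1));
    return 0;
}
#endif
```

Compile with -DDEMO_MAIN to run it standalone. Calling wait_for_close on the
accepting side of a connection made by tcptest would be one way to check the
two cases without a packet capture, though that has not been tried here.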