Mail Archives: cygwin/2003/04/21/09:54:14
--------------C1058CCDBF3398078A25FA9D
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Some have suggested that the source of this problem is the Windows
socket interface, but that doesn't seem to fit the descriptions other
people have provided. My original report is attached.
Max Bowsher <maxb AT ukf DOT net> believes it has to do with reference counting,
but I don't understand what he means by that. There must already be some
kind of reference counting, because the socket does not close until the
last process with an open path to the socket either exits or explicitly
closes that path.
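The reference counting described above can be observed directly. The following is a minimal sketch, not part of the original report: it uses an AF_UNIX socketpair() in place of a TCP connection so that it needs no remote host, and the function name and the "gate" pipe are made up for illustration. It checks that the peer sees end-of-file only once the last process holding the path has let go of it.

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 if the peer sees end-of-file only after
 * the LAST holder of a shared socket path closes it, 0 if not, and -1 on
 * a setup error.
 */
int eof_waits_for_last_close(void)
{
    int sv[2];      /* sv[0]: shared path, sv[1]: peer end */
    int gate[2];    /* pipe that keeps the child alive until released */
    char c;
    ssize_t n;
    int early_eof;
    pid_t pid;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;
    if (pipe(gate) == -1) {
        close(sv[0]); close(sv[1]);
        return -1;
    }

    pid = fork();
    if (pid == -1) {
        close(sv[0]); close(sv[1]); close(gate[0]); close(gate[1]);
        return -1;
    }
    if (pid == 0) {
        /* child: hold sv[0] open until the parent closes the gate */
        close(sv[1]);
        close(gate[1]);
        if (read(gate[0], &c, 1) < 0)
            _exit(1);
        _exit(0);               /* exiting closes the child's copy of sv[0] */
    }
    assert(pid > 0);            /* only the parent reaches this point */

    close(gate[0]);
    close(sv[0]);               /* parent closes its path first */

    /* The child still holds sv[0], so the peer must not see EOF yet. */
    n = recv(sv[1], &c, 1, MSG_DONTWAIT);
    early_eof = !(n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK));

    close(gate[1]);             /* release the child */
    if (waitpid(pid, NULL, 0) == -1) {
        close(sv[1]);
        return -1;
    }

    n = recv(sv[1], &c, 1, 0);  /* last holder is gone: EOF expected */
    close(sv[1]);
    return (!early_eof && n == 0) ? 1 : 0;
}
```

On a system with POSIX descriptor semantics the function should return 1. It says nothing about whether the final close produces a FIN or a RST on a TCP socket, which is the actual question here.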
The problem is not dependent upon data pending in the TCP send window, as
the example program I created sends no data and yet still demonstrates
the problem.
Clewis AT mobilecom DOT com suggested the problem might be that the last process to
close the socket is not the same process that opened it.
I don't mean to harp on this issue; it's just that it is a fairly significant
problem that does not seem to have been adequately explained.
Brian
( bcwhite AT precidia DOT com )
-------------------------------------------------------------------------------
the difference between theory and practice is less in theory than in practice
--------------C1058CCDBF3398078A25FA9D
Content-Type: message/rfc822
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Return-path: <cygwin-return-72086-bcwhite=precidia DOT com AT cygwin DOT com>
Received: from sources.redhat.com [66.187.233.205]
by jordan.precidia.com with smtp (Exim 3.35 #1 (Debian))
id 195Tal-0000tD-00; Tue, 15 Apr 2003 12:47:31 -0400
Received: (qmail 17682 invoked by alias); 15 Apr 2003 16:46:04 -0000
Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm
Precedence: bulk
List-Unsubscribe: <mailto:cygwin-unsubscribe-bcwhite=precidia DOT com AT cygwin DOT com>
List-Subscribe: <mailto:cygwin-subscribe AT cygwin DOT com>
List-Archive: <http://sources.redhat.com/ml/cygwin/>
List-Post: <mailto:cygwin AT cygwin DOT com>
List-Help: <mailto:cygwin-help AT cygwin DOT com>, <http://sources.redhat.com/ml/#faqs>
Sender: cygwin-owner AT cygwin DOT com
Mail-Followup-To: cygwin AT cygwin DOT com
Delivered-To: mailing list cygwin AT cygwin DOT com
Received: (qmail 17666 invoked from network); 15 Apr 2003 16:46:03 -0000
Received: from unknown (HELO mx2.magma.ca) (206.191.0.250)
by sources.redhat.com with SMTP; 15 Apr 2003 16:46:03 -0000
Received: from mail4.magma.ca (mail4.magma.ca [206.191.0.222])
by mx2.magma.ca (Magma Relay Server) with ESMTP id h3FGk4mD021393
for <cygwin AT cygwin DOT com>; Tue, 15 Apr 2003 12:46:04 -0400
Received: from ottgate.precidia.com (ottgate.precidia.com [206.191.32.162])
by mail4.magma.ca (Magma's Mail Server) with ESMTP id h3FGk3v7012751
for <cygwin AT cygwin DOT com>; Tue, 15 Apr 2003 12:46:03 -0400 (EDT)
Received: from tolkien.ott.precidia.com [10.0.1.2] (mail)
by ottgate.precidia.com with esmtp (Exim 3.35 #1 (Debian))
id 195TZL-0000KN-00; Tue, 15 Apr 2003 12:46:03 -0400
Received: from adams.ott.precidia.com (precidia.com) [10.0.2.138]
by tolkien.ott.precidia.com with esmtp (Exim 3.35 #1 (Debian))
id 195TZQ-0003Bj-00; Tue, 15 Apr 2003 12:46:08 -0400
Message-ID: <3E9C374E DOT 4256358D AT precidia DOT com>
Date: Tue, 15 Apr 2003 12:46:06 -0400
From: Brian White <bcwhite AT precidia DOT com>
Organization: Precidia Technologies http://www.precidia.com/
X-Accept-Language: en
MIME-Version: 1.0
To: cygwin AT cygwin DOT com
Subject: bug: tcp RST instead of FIN if child exists after parent closes path
Content-Type: multipart/mixed;
boundary="------------A10D7981ECC01483E6D26E19"
X-MailScanner: Found to be clean
X-MailScanner-SpamCheck: not spam, SpamAssassin (score=-99.5, required 8,
NOSPAM_INC, SPAM_PHRASE_00_01, UNSUB_PAGE, USER_IN_WHITELIST,
X_ACCEPT_LANG)
X-Mozilla-Status2: 00000000
--------------A10D7981ECC01483E6D26E19
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
I came across this while trying to get amanda to compile for Cygwin and
wrote a small test program (attached) to demonstrate the problem.
If a process opens a socket and then passes that path to a child process,
the path is now open in both. As long as the child program exits _before_
the parent closes the socket, everything is fine:
root AT watertown ~/build
$ ./tcptest --before griffon 9
connection established
child process has exited and closed socket path
closing main program's socket path
test complete
Here's the output from tcpdump of the session. Everything is correct.
Notice that the FINs don't occur until 1 second after the initial handshake
is completed, or after the child "sleep 1" is completed.
12:18:06.014140 watertown.ott.precidia.com.3669 > griffon.ott.precidia.com.discard: S 2389465615:2389465615(0) win 16384 <mss 1460,nop,nop,sackOK> (DF)
12:18:06.014247 griffon.ott.precidia.com.discard > watertown.ott.precidia.com.3669: S 690327584:690327584(0) ack 2389465616 win 5840 <mss 1460,nop,nop,sackOK> (DF)
12:18:06.014654 griffon.ott.precidia.com.discard > watertown.ott.precidia.com.3669: S 690327584:690327584(0) ack 2389465616 win 5840 <mss 1460,nop,nop,sackOK> (DF)
12:18:06.014660 watertown.ott.precidia.com.3669 > griffon.ott.precidia.com.discard: . ack 1 win 17520 (DF)
12:18:07.141048 watertown.ott.precidia.com.3669 > griffon.ott.precidia.com.discard: F 1:1(0) ack 1 win 17520 (DF)
12:18:07.141676 griffon.ott.precidia.com.discard > watertown.ott.precidia.com.3669: F 1:1(0) ack 2 win 5840 (DF)
12:18:07.141976 griffon.ott.precidia.com.discard > watertown.ott.precidia.com.3669: F 1:1(0) ack 2 win 5840 (DF)
12:18:07.142235 watertown.ott.precidia.com.3669 > griffon.ott.precidia.com.discard: . ack 2 win 17520 (DF)
Next I run the test program telling it to have the child exit _after_ the
main program has closed the socket.
root AT watertown ~/build
$ ./tcptest --after griffon 9
connection established
closing main program's socket path
child process has exited and closed socket path
test complete
This time the parent closes its socket path while the child is still running,
and the connection is aborted with a RST instead of a FIN. This is not good.
Note that the reset still arrives 1 second after the initial handshake, showing
that it happens only at the point when the child exits.
12:18:14.554661 watertown.ott.precidia.com.3768 > griffon.ott.precidia.com.discard: S 2396409222:2396409222(0) win 16384 <mss 1460,nop,nop,sackOK> (DF)
12:18:14.554774 griffon.ott.precidia.com.discard > watertown.ott.precidia.com.3768: S 693255112:693255112(0) ack 2396409223 win 5840 <mss 1460,nop,nop,sackOK> (DF)
12:18:14.555174 griffon.ott.precidia.com.discard > watertown.ott.precidia.com.3768: S 693255112:693255112(0) ack 2396409223 win 5840 <mss 1460,nop,nop,sackOK> (DF)
12:18:14.555180 watertown.ott.precidia.com.3768 > griffon.ott.precidia.com.discard: . ack 1 win 17520 (DF)
12:18:15.646499 watertown.ott.precidia.com.3768 > griffon.ott.precidia.com.discard: R 2396409223:2396409223(0) win 0 (DF)
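The difference between the two traces is also visible to the application on the receiving side, without tcpdump. The sketch below is an illustration only: the names `close_kind`, `wait_for_close` and `observe_loopback_close` are invented here, and the abortive close is provoked deliberately with SO_LINGER set to a zero timeout, which is merely a convenient way to get a RST on a loopback connection and is not a claim about what Cygwin does internally.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>

enum close_kind { CLOSE_FIN, CLOSE_RST, CLOSE_OTHER };

/* Read until the connection ends and report how it ended. */
enum close_kind wait_for_close(int sock)
{
    char buf[256];

    assert(sock >= 0);
    for (;;) {
        ssize_t n = recv(sock, buf, sizeof buf, 0);
        if (n > 0)
            continue;                   /* discard any data */
        if (n == 0)
            return CLOSE_FIN;           /* orderly close: peer sent FIN */
        if (errno == EINTR)
            continue;
        return errno == ECONNRESET ? CLOSE_RST : CLOSE_OTHER;
    }
}

/*
 * Connect to ourselves over loopback, close the client side either
 * normally or abortively, and return what the server side observed
 * (a close_kind value), or -1 on a setup error.
 */
int observe_loopback_close(int abortive)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    int lsock, csock = -1, ssock = -1;
    int result = -1;

    lsock = socket(AF_INET, SOCK_STREAM, 0);
    if (lsock == -1)
        return -1;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;                  /* let the system pick a free port */
    if (bind(lsock, (struct sockaddr *)&addr, sizeof addr) == -1 ||
        listen(lsock, 1) == -1 ||
        getsockname(lsock, (struct sockaddr *)&addr, &len) == -1)
        goto out;

    csock = socket(AF_INET, SOCK_STREAM, 0);
    if (csock == -1 ||
        connect(csock, (struct sockaddr *)&addr, sizeof addr) == -1)
        goto out;
    ssock = accept(lsock, NULL, NULL);
    if (ssock == -1)
        goto out;

    if (abortive) {
        struct linger lg;
        lg.l_onoff = 1;                 /* linger enabled ...            */
        lg.l_linger = 0;                /* ... with zero timeout: RST    */
        if (setsockopt(csock, SOL_SOCKET, SO_LINGER, &lg, sizeof lg) == -1)
            goto out;
    }
    close(csock);
    csock = -1;
    result = (int)wait_for_close(ssock);

out:
    if (csock != -1) close(csock);
    if (ssock != -1) close(ssock);
    close(lsock);
    return result;
}
```

A server that logs the result of `wait_for_close()` would report the orderly case for the first trace and the reset case for the second.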
Running the same program under Linux shows that in both cases the session
closes with a FIN.
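One thing that might be worth trying is an explicit shutdown() before close(). Under POSIX, shutdown() acts on the connection itself rather than on one descriptor, so the end-of-stream is signalled even while another holder still has the path open. The sketch below is untested on Cygwin and whether it avoids the RST there is purely an assumption. The helper names are invented, an AF_UNIX socketpair() stands in for the TCP connection, and dup() stands in for the child's inherited copy of the path.

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: signal end-of-stream explicitly, then close. */
int graceful_close(int sock)
{
    int rc = 0;

    assert(sock >= 0);
    if (shutdown(sock, SHUT_WR) == -1 && errno != ENOTCONN)
        rc = -1;
    if (close(sock) == -1)
        rc = -1;
    return rc;
}

/*
 * Returns 1 if the peer sees end-of-file after graceful_close() even
 * though a second descriptor for the same socket is still open,
 * 0 if not, -1 on a setup error.
 */
int peer_sees_eof_despite_second_holder(void)
{
    int sv[2];
    int extra;      /* stands in for the child's copy of the path */
    char c;
    ssize_t n;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;
    extra = dup(sv[0]);
    if (extra == -1) {
        close(sv[0]); close(sv[1]);
        return -1;
    }
    if (graceful_close(sv[0]) == -1) {
        close(extra); close(sv[1]);
        return -1;
    }

    /* 'extra' is still open, yet the shutdown has already reached the peer */
    n = recv(sv[1], &c, 1, MSG_DONTWAIT);

    close(extra);
    close(sv[1]);
    return n == 0 ? 1 : 0;
}
```

In tcptest.c this would amount to calling `graceful_close(sock)` where the parent currently calls `close(sock)`.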
Brian
( bcwhite AT precidia DOT com )
-------------------------------------------------------------------------------
Seize the moment! Live now. Make "now" always the most important time. -- JLP
--------------A10D7981ECC01483E6D26E19
Content-Type: text/plain; charset=us-ascii;
name="tcptest.c"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
filename="tcptest.c"
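#include <stdlib.h>
#include <stdio.h>
#include <string.h>     /* strcmp, memcpy */
#include <strings.h>    /* bzero */
#include <unistd.h>     /* fork, execlp, close */
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/wait.h>   /* waitpid */
#include <netinet/in.h> /* portable equivalent of <cygwin/in.h> */
#include <netdb.h>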
int main(int argc, char* argv[])
{
    struct sockaddr_in addr;
    struct hostent* host;
    int before;   /* nonzero: parent closes its path before the child exits */
    int sock;
    int pid;
    int status;

    if (argc != 4 || (strcmp(argv[1],"--before") != 0 && strcmp(argv[1],"--after") != 0)) {
        fprintf(stderr,"Use: %s <--before|--after> <host> <port>\n",argv[0]);
        exit(1);
    }

    /* The option says when the CHILD exits relative to the parent's close.
       With "--after" the child exits after the parent has closed, so the
       parent must close before it waits for the child. */
    if (strcmp(argv[1],"--before") != 0) {
        before = 1;
    } else {
        before = 0;
    }

    sock = socket(AF_INET,SOCK_STREAM,0);
    if (sock == -1) {
        perror("could not create socket");
        exit(1);
    }

    host = gethostbyname(argv[2]);
    if (host == NULL) {
        herror("could not resolve hostname");
        exit(1);
    }

    bzero(&addr,sizeof(addr));
    addr.sin_family = host->h_addrtype;
    memcpy(&addr.sin_addr,host->h_addr,sizeof(addr.sin_addr));
    addr.sin_port = htons(atoi(argv[3]));

    if (connect(sock,(struct sockaddr*)&addr,sizeof(addr)) == -1) {
        perror("could not connect to host");
        exit(1);
    }
    fprintf(stderr,"connection established\n");

    /* The child inherits the open socket path and holds it for 1 second. */
    pid = fork();
    switch (pid) {
    case -1: /* error */
        perror("could not fork");
        exit(1);
    case 0: /* child */
        execlp("sleep","sleep","1",NULL);
        perror("could not execlp 'sleep 1'");
        exit(1);
    default: /* parent */
        break;
    }

    /* "--after" case: close while the child still has the path open. */
    if (before) {
        fprintf(stderr,"closing main program's socket path\n");
        close(sock);
    }

    if (waitpid(pid,&status,0) == -1) {
        perror("could not wait for child");
        exit(1);
    }
    fprintf(stderr,"child process has exited and closed socket path\n");

    /* "--before" case: the child is already gone when the parent closes. */
    if (!before) {
        fprintf(stderr,"closing main program's socket path\n");
        close(sock);
    }

    fprintf(stderr,"test complete\n");
    return 0;
}
--------------A10D7981ECC01483E6D26E19
Content-Type: text/plain; charset=us-ascii
--
Unsubscribe info: http://cygwin.com/ml/#unsubscribe-simple
Bug reporting: http://cygwin.com/bugs.html
Documentation: http://cygwin.com/docs.html
FAQ: http://cygwin.com/faq/
--------------A10D7981ECC01483E6D26E19--
--------------C1058CCDBF3398078A25FA9D--