Mail Archives: cygwin-developers/2002/07/29/14:42:33

Mailing-List: contact cygwin-developers-help AT cygwin DOT com; run by ezmlm
List-Subscribe: <mailto:cygwin-developers-subscribe AT cygwin DOT com>
List-Archive: <http://sources.redhat.com/ml/cygwin-developers/>
List-Post: <mailto:cygwin-developers AT cygwin DOT com>
List-Help: <mailto:cygwin-developers-help AT cygwin DOT com>, <http://sources.redhat.com/ml/#faqs>
Sender: cygwin-developers-owner AT cygwin DOT com
Delivered-To: mailing list cygwin-developers AT cygwin DOT com
Message-ID: <005801c23730$02304170$6132bc3e@BABEL>
From: "Conrad Scott" <Conrad DOT Scott AT dsl DOT pipex DOT com>
To: <cygwin-developers AT cygwin DOT com>
Cc: "Pierre A. Humblet" <Pierre DOT Humblet AT ieee DOT org>
References: <010901c23724$96e5d430$6132bc3e AT BABEL> <3D4581E4 DOT BB580995 AT ieee DOT org>
Subject: Re: TCP problems
Date: Mon, 29 Jul 2002 19:44:44 +0100

"Pierre A. Humblet" <Pierre DOT Humblet AT ieee DOT org> wrote:
> Conrad Scott wrote:
> > *) On win98 (and possibly other non-NT systems) sockets don't seem
> > to be released properly so with a long-running server you get
> > WSAENOBUFS errors (sooner or later) and no clients can attach
> > until the server is restarted.  This is what I'm trying to
> > understand right now (w/ no success as yet) --- an "equivalent"
> > server using winsock2 directly doesn't suffer from this problem.
> >
> > *) There are a couple of reported bugs I've come across in the
> > MSDN archives that need to be worked around but aren't currently
> > (AFAICT).  For example, see "BUG: Closesocket() on a Duplicated
> > Socket Fails to Clean Up"
> > (http://support.microsoft.com/default.aspx?scid=KB;EN-US;Q198663&)
> > and "INFO: WSA_FLAG_OVERLAPPED Is Needed for Non-Blocking Sockets"
> > (http://support.microsoft.com/default.aspx?scid=kb;[LN];Q179942).
>
> Those two are the same, AFAIK.
> The problem occurs when the primary socket is closed before the
> duplicated socket (in another process). Does your "equivalent"
> server do that?

Sorry: I should have been clearer there.  No: my test cygwin
server doesn't duplicate any of its sockets (AFAICT etc. but I'm
pretty sure).  It's a really simple server: blocking accept,
read/write on the new file descriptor, then shutdown/close it and
back to a blocking accept.  And it still hits the WSAENOBUFS wall
eventually (although it can be delayed by registry patches to
increase various TCP parameters).
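
For anyone who wants to reproduce this, the loop I'm describing has
roughly the shape below.  This is only a sketch against the plain POSIX
socket calls, not the actual test program: the helper names
(make_loopback_listener, serve_one) are made up for illustration, and
echoing the data back is an assumption, since all that matters here is
the accept / read / write / shutdown / close cycle.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: listen on 127.0.0.1:port (port 0 picks any free
   port).  Returns the listening descriptor, or -1 on error. */
int make_loopback_listener(unsigned short port)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);

    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(fd, 5) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: one iteration of the server loop.  Blocks in
   accept, echoes data until the peer stops sending (echoing is an
   assumption), then shuts the connection down and closes it.
   Returns 0 on success, -1 on error. */
int serve_one(int listen_fd)
{
    int conn = accept(listen_fd, NULL, NULL);   /* blocking accept */
    if (conn < 0)
        return -1;

    char buf[512];
    ssize_t n;
    int rc = 0;

    while (rc == 0 && (n = read(conn, buf, sizeof buf)) > 0) {
        ssize_t off = 0;
        while (off < n) {                       /* handle short writes */
            ssize_t w = write(conn, buf + off, (size_t)(n - off));
            if (w < 0) {
                rc = -1;
                break;
            }
            off += w;
        }
    }
    if (rc == 0 && n < 0)
        rc = -1;                                /* read failed */

    shutdown(conn, SHUT_RDWR);                  /* shutdown, then close */
    if (close(conn) < 0)
        rc = -1;
    return rc;
}
```

The server proper is then just make_loopback_listener() once, followed
by serve_one() in an endless loop; no dup(2) or fork(2) anywhere.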

> The solution I implemented in some test code (and which runs fine,
> but uses a non-unix "close on fork" fcntl) is the second one,
> i.e. "The other possibility...".
> I have scratched my head about the "dummy tcp socket" and tried
> various things, without success. Have you experimented with that?

I haven't experimented with it yet and it does look weird.  The
code would need to detect a close of a duplicated socket so it
would need a new flag in the fhandler_socket structure to do it
right (I suppose I could just add a dummy socket/closesocket call
regardless to see if it has any effect).  In general I've done
less work with dup(2) and fork(2) than the other problems so far.

> > One idea I've had is to extend the semaphore work I put into the
> > UNIX domain socket patch to allow the DLL to detect the last close
> > of a socket if it's been duplicated by whatever means.  This would
> > allow the DLL to close the socket "properly" (e.g. non-blocking +
> > shutdown(2) + linger as appropriate).
>
> I am not sure this does it (perhaps I don't understand what you mean).
> As I recall, calling shutdown makes the socket not appear as
> CLOSE_WAIT in netstat -a, but you still get the WSAENOBUFS after
> a while. Again, the key is to delay closing the primary socket.

The idea would be to detect the last close of a given socket
system wide; so it wouldn't matter whether the parent or child or
whatever was the last to close the socket.  Thus, now that we know
it's the last close and thus there can be no other operations
outstanding on the socket (or none that we need to worry about:
the code is closing the socket after all), we can close it with
shutdown and linger delays etc.  This solves the problem with a
client closing a socket w/o shutdown and exiting, thus leading to
data loss, which the current linger mod. in fhandler_socket::close
addresses.
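
To make the idea concrete, here is a rough sketch in plain POSIX C.  It
is not the DLL code and not the semaphore scheme from the UNIX domain
socket patch: the sock_count type and the tracked_close function are
hypothetical, and the sketch assumes the counter lives somewhere every
holder of the socket can see (shared memory, say, for the cross-process
case).  Each duplication bumps the count, each close drops it, and only
the close that takes it to zero does the linger + shutdown dance.

```c
#include <stdatomic.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical holder count for one underlying socket.  Assumed to be
   placed in memory shared by every process holding a descriptor. */
typedef struct {
    atomic_int holders;
} sock_count;

/* The socket starts with one holder: whoever created it. */
void sock_count_init(sock_count *c)
{
    atomic_init(&c->holders, 1);
}

/* Call after each dup(2), fork(2), etc. that adds a holder. */
void sock_count_add(sock_count *c)
{
    atomic_fetch_add(&c->holders, 1);
}

/* Close fd.  If this is the last holder, first enable SO_LINGER and
   shut down the sending side so queued data gets a chance to drain.
   Returns 1 if this was the last close, 0 if not, -1 if close failed. */
int tracked_close(int fd, sock_count *c, int linger_secs)
{
    /* atomic_fetch_sub returns the value before the decrement. */
    int last = (atomic_fetch_sub(&c->holders, 1) == 1);

    if (last) {
        struct linger lg = { .l_onoff = 1, .l_linger = linger_secs };
        /* Best effort: failures here are ignored in this sketch. */
        (void)setsockopt(fd, SOL_SOCKET, SO_LINGER, &lg, sizeof lg);
        (void)shutdown(fd, SHUT_WR);
    }

    if (close(fd) < 0)
        return -1;
    return last;
}
```

A holder that is not the last one just drops its descriptor and leaves
the connection alone, so it doesn't matter in which order the parent
and child get around to closing.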

It doesn't solve the "must close first socket last of all"
problem: I wanted to see how far we could get without that being
done anywhere, especially as I can't see any fix for that short of
drastic surgery (all sockets opened in the cygserver that keeps
them until a client detects last close . . .?  yuck but possible
except for systems where sockets are used to communicate with
cygserver: oops).

Thanks for the comments even if none of them suggest easy ways
around these problems :-)

// Conrad


