Mail Archives: cygwin-developers/2001/11/16/04:37:10
On Thu, Nov 15, 2001 at 08:00:18PM -0700, robert bowman wrote:
> On Thursday 15 November 2001 14:21, you wrote:
> > I've dug deeply enough into this to determine that I believe the
> > problem is caused by a bug in winsock. I can get the problem to
> > manifest itself completely independently from Cygwin. See the full
> > description in the attached program, which one of my coworkers with an
> > MSDN subscription is going to forward to Microsoft to see what they
> > have to say about it.
>
> For what it's worth, we recently encountered this problem in the ONC RPC
> library. The original Sun code, and every revision I've been able to find,
> binds a local port even for TCP. The same thing happens: the bind does not
> fail, and the failure occurs on the connect.
>
> We depend on RPC heavily, and would see delays on startup when the initial
> clnt_create would fail repeatedly. The RPC code attempts to use a pool of
> local ports, and will increment and retry if the bind fails -- but the bind
> never fails, so the retry never happens.
>
> This is not a cygwin issue; we are using the MKS/DataFocus NutCracker
> toolkit. DataFocus provided the ported ONC RPC code but does not support it.
> We have been tinkering with it in-house. In this case, eliminating the bind
> gives some improvement.
>
> There are other issues we are dealing with. I've forwarded a couple of the
> emails to another programmer at work who is also working on NT/2000 socket
> issues.
>
> Interestingly enough, on Linux, the bind also fails unless the process has
> root privileges. However, the code only iterates on EADDRINUSE and the
> return value is not otherwise checked, so the connect succeeds.
>
> I also wrote a native testcase with the WSA calls and got the same results.
> I did note that the OS expires the port eventually, but it takes 5 to 20
> minutes.
>
> I believe the root of the problem is that both the remote host address and
> local port are used to determine if the connection is unique. bind would fail
> if anything other than ANY_ADDR is used, so at the time of the bind it isn't
> known if the combination is unique. Only when the remote host address
> becomes known, in connect, can the combination be found to collide.
>
> Our problem was exacerbated by the fact that several apps are typically
> started at the same time on one station, and they are all trying to make RPC
> connections to the server machine. The ONC RPC algo uses the pid to calculate
> which port to try first; with several clients starting and making several
> connections, there would be groups of used ports. If a connection timed out,
> and the next attempt moved into a cluster of ports being used by another
> app, the clnt_create would fail many times before it finally iterated into
> fresh territory.
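The clustering effect described above can be illustrated with a small
sketch. It is not the ONC RPC source: the port range (600 to 1023) and the
"pid modulo range size" seed are assumptions modelled on the traditional
reserved-port scan, and candidate_ports() and wasted_attempts() are
hypothetical helper names.

```python
# Assumed values, for illustration only: a reserved-port range of the kind
# Sun-derived RPC code traditionally scans. Not taken from the post.
START_PORT = 600
END_PORT = 1023
NUM_PORTS = END_PORT - START_PORT + 1


def candidate_ports(pid, attempts=NUM_PORTS):
    """Yield local ports in the order a pid-seeded scan would try them."""
    port = START_PORT + pid % NUM_PORTS
    for _ in range(attempts):
        yield port
        port += 1
        if port > END_PORT:
            port = START_PORT  # wrap around to the start of the range


def wasted_attempts(pid, ports_in_use):
    """Count the candidates tried before the scan reaches a free port."""
    for tries, port in enumerate(candidate_ports(pid)):
        if port not in ports_in_use:
            return tries
    return NUM_PORTS  # every port in the range is taken
```

With these assumptions, a process whose scan starts at the beginning of a
run of N ports held by another app needs N failed attempts before it
reaches a free one. Each of those attempts costs a failed connect rather
than a cheap failed bind, which matches the startup delays described.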
Thanks for that interesting description. There's that SO_REUSEADDR
option to setsockopt(). I wonder if that could help. It's
considered somewhat dangerous, though.
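
A minimal sketch of what that would look like, using Python's socket module
for brevity (the same setsockopt/bind/connect sequence applies in C).
Whether it actually avoids the connect failure on winsock is untested here;
connect_from_local_port() and its parameters are hypothetical names.

```python
import socket


def connect_from_local_port(remote, local_port=0, reuse=True, timeout=5.0):
    """Bind a TCP client socket to a local port, then connect to remote.

    local_port=0 lets the OS choose; a fixed value mimics the RPC pattern
    of binding an explicit local port before connecting.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        if reuse:
            # The option has to be set before bind() to have any effect.
            sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind(("", local_port))  # "" means any local address
        sock.settimeout(timeout)
        sock.connect(remote)
    except OSError:
        sock.close()
        raise
    return sock
```

The "dangerous" part is that on Windows SO_REUSEADDR lets a second socket
bind to an address and port that another socket already holds, so it is
more permissive than the option of the same name on most Unix systems.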
Corinna
--
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Developer                  mailto:cygwin AT cygwin DOT com
Red Hat, Inc.