Mail Archives: cygwin/2020/12/04/08:52:06

Subject: Re: Unix Domain Socket Limitation?
To: Ken Brown <kbrown AT cornell DOT edu>, cygwin <cygwin AT cygwin DOT com>
From: Norton Allen <allen AT huarp DOT harvard DOT edu>
Date: Fri, 4 Dec 2020 08:51:02 -0500

On 12/3/2020 8:11 PM, Ken Brown wrote:
> On 12/2/2020 12:30 PM, Norton Allen wrote:
>> On 11/30/2020 9:22 PM, Norton Allen wrote:
>>> Yeah, so now the example no longer blocks for me. Unfortunately 
>>> these bugs are not present in my application, so I will need to keep 
>>> working on this.
>>>
>>
>> After paring the main application down and back up, I finally 
>> narrowed in on the condition that was causing this blocking behavior. 
>> The issue arises when a client connect()s twice to the same server 
>> with non-blocking unix-domain sockets before calling select().
>>
>> There are a few pieces to this. With the client configured to 
>> connect() just once, I can see that the server's select() returns as 
>> soon as the client calls connect(), but then the server's accept() 
>> blocks until the client calls select(). That is not proper 
>> non-blocking behavior, but it appears that the implementation under 
>> Cygwin does require that client and server both be communicating 
>> synchronously to accomplish the connect() operation.
>>
>> I tried running this under Ubuntu 16.04 and found that connect() 
>> succeeded immediately, so no subsequent select() is required, and 
>> there does not appear to be a possibility for this collision. That 
>> proves to hold true even if the server is not waiting in select() to 
>> process the connect() with accept().
>>
>> A workaround for this issue may be to keep the socket blocking until 
>> after connect().
>>
>> I have pushed the new minimal example program, 'rapid_connects', to 
>> https://github.com/nthallen/cygwin_unix
>>
>> The server is run like before as:
>>
>>     $ ./rapid_connects server
>>
>> The client can be run in two different modes. To connect with just 
>> one socket:
>>
>>     $ ./rapid_connects client1
>>
>> To connect with two:
>>
>>     $ ./rapid_connects client2
>>
>> My immediate strategy will be to develop a workaround for my project. 
>> Having spent a day inside cygwin1.dll, I can see that I have a steep 
>> learning curve to make much of a contribution there.
>
> I'm traveling at the moment and unable to do any testing, but I wonder 
> if you're bumping into an issue that was just discussed on the 
> cygwin-developers list:
>
> https://cygwin.com/pipermail/cygwin-developers/2020-December/012015.html
>
> A different workaround is described there.
>
> If it's the same issue, then I don't think it will happen with the new 
> AF_UNIX implementation.  More in a few days.
>
It does seem related.

A workaround that is working for me is to do a blocking connect() and 
switch the socket to non-blocking once that completes. In my application, 
the connect() generally occurs once at the beginning of a run, so blocking 
for a few milliseconds does not impact responsiveness.
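
For anyone wanting to try the same thing, here is a minimal sketch of that 
workaround in C. It is not taken from rapid_connects or from my 
application: the helper name connect_then_nonblock() and its error 
handling are illustrative assumptions, and only standard POSIX calls 
(socket(), connect(), fcntl()) are used.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper (name chosen for illustration only).
 * Connects to the AF_UNIX stream socket at `path` while the descriptor is
 * still in its default blocking mode, then sets O_NONBLOCK.
 * Returns the connected fd, or -1 with errno set by the failing call. */
int connect_then_nonblock(const char *path)
{
    struct sockaddr_un addr;
    int fd, flags, saved;

    if (strlen(path) >= sizeof(addr.sun_path)) {
        errno = ENAMETOOLONG;
        return -1;
    }

    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strcpy(addr.sun_path, path);   /* length checked above */

    /* Blocking connect(): returns only once the connection attempt has
     * finished, so no follow-up select() is needed to complete it. */
    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        goto fail;

    /* Only now switch to non-blocking for the rest of the session. */
    flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
        goto fail;

    return fd;

fail:
    saved = errno;                 /* close() may overwrite errno */
    close(fd);
    errno = saved;
    return -1;
}
```

The returned descriptor can then go into the usual select() loop. The 
trade-off is that the caller stalls for the duration of connect(), which 
is acceptable when connections are made once at startup, but probably not 
for a program that opens connections while it is servicing other sockets.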


--
Problem reports:      https://cygwin.com/problems.html
FAQ:                  https://cygwin.com/faq/
Documentation:        https://cygwin.com/docs.html
Unsubscribe info:     https://cygwin.com/ml/#unsubscribe-simple
