Mail Archives: cygwin/2015/06/15/12:39:40

X-Recipient: archive-cygwin AT delorie DOT com
Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm
List-Id: <cygwin.cygwin.com>
List-Subscribe: <mailto:cygwin-subscribe AT cygwin DOT com>
List-Archive: <http://sourceware.org/ml/cygwin/>
List-Post: <mailto:cygwin AT cygwin DOT com>
List-Help: <mailto:cygwin-help AT cygwin DOT com>, <http://sourceware.org/ml/#faqs>
Sender: cygwin-owner AT cygwin DOT com
Mail-Followup-To: cygwin AT cygwin DOT com
Delivered-To: mailing list cygwin AT cygwin DOT com
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-5.4 required=5.0 tests=AWL,BAYES_00,KAM_LAZY_DOMAIN_SECURITY autolearn=no version=3.3.2
X-HELO: calimero.vinschen.de
Date: Mon, 15 Jun 2015 18:39:22 +0200
From: Corinna Vinschen <corinna-cygwin AT cygwin DOT com>
To: cygwin AT cygwin DOT com
Subject: Re: deadlock/hang when calling close()/connect() at the same time on the same socket
Message-ID: <20150615163922.GW31537@calimero.vinschen.de>
Reply-To: cygwin AT cygwin DOT com
References: <557AFD15 DOT 4070802 AT edman007 DOT com>
MIME-Version: 1.0
In-Reply-To: <557AFD15.4070802@edman007.com>
User-Agent: Mutt/1.5.23 (2014-03-12)

--JWJEtCrVvH5hpatL
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

Hi Ed,

On Jun 12 11:39, Ed Martin wrote:
> It appears that opening a socket, and then calling connect() on it and
> from another thread calling close() on the socket while it's still in
> connect() results in a deadlock. Furthermore in this state the thread
> cannot be canceled and connect() will never return (my testcase uses
> pthread_cancel(), but it happens without that as well)

Thanks for the testcase.

I tried this with Cygwin 2.0.4 on 32 bit Windows 7 and 64 bit Windows
8.1.  In both cases I get

  $ ./bug
  Test started
  connect: Bad address
  no bug

The "Bad address" isn't exactly right.  I changed that to return the
same error codes as if shutdown had been called.  Note that there's no
hang.  I can't reproduce a deadlock.

The difference is, on Linux connect will continue to hang until the call
to pthread_cancel, while on Cygwin it will return with an error after
you call close.  I don't see how the Linux behaviour can be emulated
under Cygwin due to the way Windows socket event handling works (which
is what Cygwin uses under the hood).  Anyway, either way should be fine
since both unblock the connect call.

However, calling close on a descriptor while performing a system call
on this descriptor in another thread is undefined.  Even the Linux
man page for close warns:

  It  is  probably  unwise to close file descriptors while they may be in
  use by system calls in other threads in the same process.  Since a file
  descriptor  may  be reused, there are some obscure race conditions that
  may cause unintended side effects.

See, e.g., http://linux.die.net/man/2/close
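
To illustrate the descriptor reuse the man page warns about, here is a
minimal sketch.  The function name fd_number_is_reused is made up for
this example, and it assumes a single-threaded process on a POSIX
system, where a newly opened descriptor gets the lowest unused number.

```c
#include <assert.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical demo function.  Returns 1 if a socket opened right after
   close() receives the same descriptor number as the socket that was
   just closed, 0 otherwise. */
int
fd_number_is_reused (void)
{
  int first = socket (AF_INET, SOCK_STREAM, 0);
  assert (first >= 0);
  close (first);

  /* POSIX hands out the lowest unused number, so with no other thread
     opening files in between, this socket gets the number just freed. */
  int second = socket (AF_INET, SOCK_STREAM, 0);
  assert (second >= 0);
  int reused = (second == first);
  close (second);
  return reused;
}
```

If another thread were still inside a system call on the old number, it
could end up operating on the new, unrelated socket.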

In Cygwin the problem is that a close() call also removes the objects
and data structures connected to the descriptor.  Calling close on a
descriptor in one thread can therefore let system calls which are still
running in other threads access invalid memory or synchronization
objects.

What you should do, in theory, is to use nonblocking sockets in
conjunction with select, or signal the blocking thread so connect
returns with EINTR, and only then close the socket.
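
The first approach could look roughly like the sketch below.  It is not
taken from the testcase or from Cygwin itself.  The helper
connect_cancellable, the cancel pipe and the demo function are
assumptions made for illustration: the connecting thread waits in select
on the socket and on the read end of a pipe, another thread writes a
byte to the pipe to ask it to stop, and the connecting thread closes the
socket itself.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper, not a Cygwin or POSIX API.  Starts a nonblocking
   connect on FD and waits until it completes, CANCEL_FD becomes readable,
   or TIMEOUT_MS expires.  Returns 0 on success, -1 with errno set. */
int
connect_cancellable (int fd, const struct sockaddr *addr, socklen_t len,
                     int cancel_fd, int timeout_ms)
{
  int flags = fcntl (fd, F_GETFL, 0);
  if (flags < 0 || fcntl (fd, F_SETFL, flags | O_NONBLOCK) < 0)
    return -1;
  if (connect (fd, addr, len) == 0)
    return 0;                   /* connected right away */
  if (errno != EINPROGRESS)
    return -1;

  fd_set rfds, wfds;
  FD_ZERO (&rfds);
  FD_ZERO (&wfds);
  FD_SET (cancel_fd, &rfds);    /* readable: another thread asked to stop */
  FD_SET (fd, &wfds);           /* writable: connect finished or failed */
  struct timeval tv = { timeout_ms / 1000, (timeout_ms % 1000) * 1000 };
  int maxfd = fd > cancel_fd ? fd : cancel_fd;

  int n = select (maxfd + 1, &rfds, &wfds, NULL, &tv);
  if (n < 0)
    return -1;
  if (n == 0)
    {
      errno = ETIMEDOUT;
      return -1;
    }
  if (FD_ISSET (cancel_fd, &rfds))
    {
      errno = ECANCELED;
      return -1;
    }

  /* The socket is writable; SO_ERROR tells whether connect succeeded. */
  int err = 0;
  socklen_t errlen = sizeof err;
  if (getsockopt (fd, SOL_SOCKET, SO_ERROR, &err, &errlen) < 0)
    return -1;
  if (err != 0)
    {
      errno = err;
      return -1;
    }
  return 0;
}

/* Usage sketch: connect to a throwaway listener on the loopback interface. */
int
demo_loopback_connect (void)
{
  struct sockaddr_in sa = { 0 };
  socklen_t salen = sizeof sa;
  int cancel_pipe[2];

  sa.sin_family = AF_INET;
  sa.sin_addr.s_addr = htonl (INADDR_LOOPBACK);  /* port 0: kernel picks */
  int lfd = socket (AF_INET, SOCK_STREAM, 0);
  int cfd = socket (AF_INET, SOCK_STREAM, 0);
  assert (lfd >= 0 && cfd >= 0);
  int rc = bind (lfd, (struct sockaddr *) &sa, sizeof sa);
  assert (rc == 0);
  rc = listen (lfd, 1);
  assert (rc == 0);
  rc = getsockname (lfd, (struct sockaddr *) &sa, &salen);
  assert (rc == 0);
  rc = pipe (cancel_pipe);
  assert (rc == 0);

  rc = connect_cancellable (cfd, (struct sockaddr *) &sa, sizeof sa,
                            cancel_pipe[0], 2000);
  close (cfd);                  /* closed by the thread that used it */
  close (lfd);
  close (cancel_pipe[0]);
  close (cancel_pipe[1]);
  return rc;
}
```

The point of the design is that no thread ever closes a descriptor
another thread is blocked on.  The cancelling thread only writes to the
pipe.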

The problem with the latter approach is that it won't work with socket
functions in Cygwin up to 2.0.4 :(

The cause is that SA_RESTART is, for some reason, enforced in all
threads other than the main thread, so an interrupted call is restarted
rather than returning EINTR.  The code in question pretty much looks
like a leftover of outdated behaviour.

I applied patches to fix or work around the problems outlined above
and uploaded new developer snapshots to https://cygwin.com/snapshots/
Please give 'em a try.


Thanks,
Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat

--JWJEtCrVvH5hpatL--
