From: Kevin Nomura <knomura@vmware.com>
To: "cygwin@cygwin.com" <cygwin@cygwin.com>
CC: Taylor Hutt <thutt@vmware.com>
Subject: spinlock.h timeout causing *** fatal error - add_item abort
Date: Fri, 29 Apr 2016 16:03:39 +0000
Message-ID: <D348D3EA.10AEB%knomura@vmware.com>

Hi,

We occasionally see an api_fatal abort during process startup like:

*** fatal error - add_item ("somepath", "/", ...) failed, errno 1

This happens if mountinfo.init(false) in user_info::initialize is
called twice.  The error occurs on the second call, which tries to add
the root mount point when it already exists.

mountinfo.init is guarded by a "spinlock" object that should allow
only one process to call it.  But the spinlock has a timeout.  After
15 seconds, it stops waiting and returns a value of 0.  The fatal
error can occur if two processes start at around the same time and
the first process spends a long time in internal_getpwsid().  We've
seen this happen in our environment because LDAP queries take a long
time.  (Incidentally, we are using msys, but the code in spinlock.h
and shared.cc looks the same in Cygwin.)
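
To make the sequence concrete, here is a minimal C++ sketch of the race
as I understand it.  It is a simplified model and not the code in
spinlock.h: MountTable, timed_acquire, DemoResult and run_demo are
hypothetical names, the lock state encoding (0, -1, 1) is an assumption,
and the timeout is a parameter so the demo does not have to wait 15
seconds.  Caller A stands in for the process that stalls in a slow
lookup, and caller B for the process that gives up waiting.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <set>
#include <string>
#include <thread>

// Hypothetical stand-in for the shared mount table: adding a mount point
// that already exists fails, like the add_item error in the report.
struct MountTable {
    std::set<std::string> points;
    bool add_item(const std::string& posix_path) {
        return points.insert(posix_path).second;
    }
};

// Simplified timed lock (an assumption, not the real spinlock class).
// state: 0 = uninitialized, -1 = initialization in progress, 1 = done.
// Returns 0 when the caller should run the initialization itself, either
// because it got there first or because it gave up waiting.
int timed_acquire(std::atomic<int>& state, std::chrono::milliseconds timeout) {
    int expected = 0;
    if (state.compare_exchange_strong(expected, -1))
        return 0;  // first caller: go ahead and initialize
    const auto deadline = std::chrono::steady_clock::now() + timeout;
    while (state.load() == -1) {
        if (std::chrono::steady_clock::now() >= deadline)
            return 0;  // timed out: caller initializes anyway
        std::this_thread::yield();
    }
    return state.load();  // someone else finished initializing
}

struct DemoResult {
    bool a_ok;  // did the stalled first caller's add_item succeed?
    bool b_ok;  // did the timed-out second caller's add_item succeed?
};

DemoResult run_demo(std::chrono::milliseconds timeout) {
    MountTable table;
    std::atomic<int> state{0};
    DemoResult result{false, false};

    // Caller A takes the lock, then stalls before touching the table
    // (standing in for a slow account lookup).
    const int a = timed_acquire(state, timeout);
    assert(a == 0);
    (void)a;  // avoid an unused-variable warning when NDEBUG is defined

    // Caller B arrives while A is still stalled, waits, then times out
    // and initializes the table itself.
    std::thread b([&] {
        if (timed_acquire(state, timeout) == 0)
            result.b_ok = table.add_item("/");
    });
    b.join();

    // A finally resumes: this is the second initialization, and it fails
    // because "/" is already present.
    result.a_ok = table.add_item("/");
    state.store(1);
    return result;
}
```

Calling run_demo(std::chrono::milliseconds(50)) reports that B's
add_item succeeds and A's later add_item fails.  In this model, waiting
without a timeout would avoid the double call, but B would then stay
blocked for as long as A holds the lock.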

To stop the aborts, it is tempting to make a local fix that removes the
spinlock timeout.  I assume there was a rationale for the timeout, and
would like to understand what tradeoff we incur by removing it.

- Kevin

