Mail Archives: cygwin/2012/12/21/05:33:41

X-Recipient: archive-cygwin AT delorie DOT com
X-Spam-Check-By: sourceware.org
Date: Fri, 21 Dec 2012 11:32:41 +0100
From: Corinna Vinschen <corinna-cygwin AT cygwin DOT com>
To: cygwin AT cygwin DOT com
Subject: Re: Intermittent failures retrieving process exit codes
Message-ID: <20121221103241.GD18188@calimero.vinschen.de>
Reply-To: cygwin AT cygwin DOT com
Mail-Followup-To: cygwin AT cygwin DOT com
References: <50C2498C DOT 2000003 AT coverity DOT com> <50C276AC DOT 9090301 AT mailme DOT ath DOT cx> <50D401EF DOT 9040705 AT coverity DOT com>
MIME-Version: 1.0
In-Reply-To: <50D401EF.9040705@coverity.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm
List-Id: <cygwin.cygwin.com>
List-Unsubscribe: <mailto:cygwin-unsubscribe-archive-cygwin=delorie DOT com AT cygwin DOT com>
List-Subscribe: <mailto:cygwin-subscribe AT cygwin DOT com>
List-Archive: <http://sourceware.org/ml/cygwin/>
List-Post: <mailto:cygwin AT cygwin DOT com>
List-Help: <mailto:cygwin-help AT cygwin DOT com>, <http://sourceware.org/ml/#faqs>
Sender: cygwin-owner AT cygwin DOT com
Delivered-To: mailing list cygwin AT cygwin DOT com

On Dec 21 01:30, Tom Honermann wrote:
> I spent most of the week debugging this issue.  This appears to be a
> defect in Windows.  I can reproduce the issue without Cygwin.  I
> can't rule out other third party kernel mode software possibly
> contributing to the issue.  A simple change to Cygwin works around
> the problem for me.
> 
> I don't know which Windows releases are affected by this.  I've only
> reproduced the problem (outside of Cygwin) with Wow64 processes
> running on 64-bit Windows 7.  I haven't yet tried elsewhere.
> 
> The problem appears to be a race condition involving concurrent
> calls to TerminateProcess() and ExitThread().  The example code
> below minimally mimics the threads created and exit process/thread
> calls that are performed when running Cygwin's false.exe.  The
> primary thread exits the process via TerminateProcess() à la
> pinfo::exit() in winsup/cygwin/pinfo.cc.  The secondary thread exits
> itself via ExitThread() à la Cygwin's signal processing thread
> function, wait_sig(), in winsup/cygwin/sigproc.cc.
> 
> When the race condition results in the undesirable outcome, the exit
> code for the process is set to the exit code for the secondary
> thread's call to ExitThread().  I can only speculate at this point,
> but my guess is that the TerminateProcess() code disassociates the
> calling thread from the process before other threads are stopped
> such that ExitThread(), concurrently running in another thread, may
> determine that the calling thread is the last thread of the process
> and overwrite the process exit code.
> 
> The issue also reproduces if ExitProcess() is called in place of
> TerminateProcess().  The test case below only uses
> TerminateProcess() because that is what Cygwin does.
> 
> Source code to reproduce the issue follows.  Again, Cygwin is not
> required to reproduce the problem.  For my own testing, I compiled
> the code using Microsoft's Visual Studio 2010 x86 compiler with the
> command 'cl /Fetest-exit-code.exe test-exit-code.cpp'
> 
> test-exit-code.cpp:

Wow.  Thanks for this testcase.  I could not reproduce the issue on a
single-CPU, single-core setup, but on a dual-core system it showed up
almost immediately, twice in a row in under 5 seconds.

> The workaround I implemented within Cygwin was simple and sloppy.  I
> added a call to Sleep(1000) immediately before the call to
> ExitThread() in wait_sig() in winsup/cygwin/sigproc.cc.  Since this
> thread (probably) doesn't exit until the process is exiting anyway,
> the call to Sleep() does not adversely affect shutdown.  The thread
> just gets terminated while in the call to Sleep() instead of exiting
> before the process is terminated or getting terminated while still
> in the call to ExitThread().  A better solution might be to avoid
> the thread exiting at all (so long as it can't get terminated while
> holding critical resources), or to have the process exiting thread
> wait on it.  Neither of these is ideal.  Orderly shutdown of
> multi-threaded processes is really hard to do correctly on Windows.
> 
> Since the exit code for the signal processing thread is not used,
> having the wait_sig() thread (and any other threads that could
> potentially concurrently exit with another thread) exit with a
> special status value such as STATUS_THREAD_IS_TERMINATING
> (0xC000004BL) would enable diagnosis of this issue as any process
> exit code matching this would be a likely indicator that this issue
> was encountered.
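
The diagnostic idea quoted above can be sketched roughly as follows.  This
is only an illustration under stated assumptions, not Cygwin code: the
helper name looks_like_thread_exit_race() and the constant name
kThreadIsTerminating are hypothetical, and the value is simply the one
quoted above for STATUS_THREAD_IS_TERMINATING.  The snippet is kept free of
<windows.h> so it compiles anywhere; on Windows the argument would
presumably be the DWORD obtained from GetExitCodeProcess().

```cpp
#include <cstdint>

// Sentinel value quoted in the message above (STATUS_THREAD_IS_TERMINATING).
// Assumption: helper threads such as wait_sig() would pass this value to
// ExitThread() instead of an ordinary exit code.
constexpr std::uint32_t kThreadIsTerminating = 0xC000004Bu;

// Hypothetical helper: returns true when a process exit code equals the
// sentinel, which would suggest that a thread's ExitThread() status
// overwrote the exit code the process intended to report.
constexpr bool looks_like_thread_exit_race(std::uint32_t process_exit_code) {
    return process_exit_code == kThreadIsTerminating;
}
```

A parent process could run such a check on a child's exit code and log a
warning on a match.  A match is only a hint: a process could in principle
exit with the same value for unrelated reasons.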

Maybe the signal thread should really not exit by itself, but just
wait until TerminateThread is called.  Chris?


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat

--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
