Mail Archives: cygwin/2012/12/07/14:55:32

Message-ID: <50C2498C.2000003@coverity.com>
Date: Fri, 7 Dec 2012 14:54:52 -0500
From: Tom Honermann <thonermann AT coverity DOT com>
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:14.0) Gecko/20120714 Thunderbird/14.0
MIME-Version: 1.0
To: <cygwin AT cygwin DOT com>
Subject: Intermittent failures retrieving process exit codes
X-OriginatorOrg: coverity.com
X-IsSubscribed: yes
Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm
List-Id: <cygwin.cygwin.com>
List-Subscribe: <mailto:cygwin-subscribe AT cygwin DOT com>
List-Archive: <http://sourceware.org/ml/cygwin/>
List-Post: <mailto:cygwin AT cygwin DOT com>
List-Help: <mailto:cygwin-help AT cygwin DOT com>, <http://sourceware.org/ml/#faqs>
Sender: cygwin-owner AT cygwin DOT com
Mail-Followup-To: cygwin AT cygwin DOT com
Delivered-To: mailing list cygwin AT cygwin DOT com

I've witnessed intermittent failures in multiple build systems while 
working at multiple companies using Cygwin bash and make as part of the 
build system but using non-Cygwin compilers and other tools.  The 
intermittent failures occur when a process appears to complete 
successfully, but the process retrieving its exit code receives an 
unexpected value.  This has been seen on many different Cygwin versions 
across several years.

Several reports of similar-sounding issues can be found online:
- 
http://cygwin.1069669.n5.nabble.com/Cygwin-1-7-x-on-Windows-7-Exit-statuses-of-Win32-executables-are-sometimes-wrong-td20186.html
- 
http://stackoverflow.com/questions/9769256/intermittent-failures-under-cygwin-possibly-related-to-candle-and-or-make

I recently was able to produce a very small test case that reproduces 
this issue reliably on some machines:

$ cat test.sh
#!/bin/sh

while [ 1 ]; do
   echo "test..."
   if cmd /c "false"; then
     echo "exiting..."
     exit 1
   fi
done

An invocation of test.sh should run indefinitely, but fails very quickly 
on one of my machines:

$ ./test.sh
test...
test...
exiting...

$ ./test.sh
test...
test...
test...
test...
exiting...

$ ./test.sh
test...
exiting...
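
To see how many iterations pass before the failure and which exit status
bash actually receives, the loop can be bounded and the status recorded.
The following is only a minimal sketch: the helper name
count_until_unexpected and its argument order are hypothetical, not part
of the original test case.

```shell
#!/bin/sh
# Hypothetical helper (not from the original report): run a command
# repeatedly and report the first iteration whose exit status differs
# from the expected one.
# usage: count_until_unexpected EXPECTED MAX_ITERATIONS COMMAND [ARGS...]
count_until_unexpected() {
    expected=$1
    max=$2
    shift 2
    i=1
    while [ "$i" -le "$max" ]; do
        status=0
        "$@" || status=$?          # capture the status without aborting
        if [ "$status" -ne "$expected" ]; then
            echo "iteration $i: got status $status, expected $expected"
            return 1
        fi
        i=$((i + 1))
    done
    echo "no unexpected status in $max iterations"
    return 0
}
```

On an affected machine this would be invoked as, for example,
'count_until_unexpected 1 10000 cmd /c false'.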

There are several high-level possibilities for what is going wrong:

1) cmd.exe is failing to retrieve the correct exit code for the
invocation of false.exe (a Cygwin process).

2) cmd.exe is failing to return the (correct) exit code it received for
the invocation of false.exe.

3) bash.exe (a Cygwin process) is failing to retrieve the correct exit
code for the invocation of cmd.exe.

It is possible that other software installed on the machines where I've
witnessed this is contributing to the problem (à la
http://cygwin.com/faq/faq.using.html#faq.using.bloda).  If so, such
software would be a contributing factor to one of the explanations
above, but that would not necessarily rule out a defect in Cygwin (or
in CreateProcess, WaitForSingleObject, or GetExitCodeProcess).
I have not yet seen a similar case that does not involve Cygwin, so at
present I suspect a defect in Cygwin, but possibly one that produces no
negative symptoms in isolation.

I've reproduced this issue with both the 32-bit and 64-bit versions of
cmd.exe.  I've also reproduced it by replacing cmd.exe with a C program
that calls CreateProcess for Cygwin's false.exe on its own.  The issue
reproduces whether that C program is compiled with Cygwin gcc, MinGW gcc
(32-bit and 64-bit), or MSVC (32-bit and 64-bit).  So, substitute
what you like for 'cmd.exe' in the above.

Likewise, I've reproduced this issue by replacing false.exe in the test
above with a custom myfalse.exe (a C program that just returns 1).  The
issue reproduces whether myfalse.exe is compiled with Cygwin gcc, MinGW
gcc (32-bit and 64-bit), or MSVC (32-bit and 64-bit).  So,
substitute what you like for 'false.exe' in the above.

I am not able to reproduce the problem if I elide the invocation of
false.exe (i.e., if the cmd.exe invocation is 'cmd /c "exit /B 1"' or if
my replacement for cmd.exe just returns 1).
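
The variants described above (with and without a nested child process)
can be compared by counting how often each one yields a status other
than 1 over a fixed number of runs.  This is a sketch under assumptions:
the helper name count_unexpected and the labels are hypothetical, and
the expected status of 1 is taken from the test case.

```shell
#!/bin/sh
# Hypothetical helper (not from the original report): run a command
# COUNT times and print how many runs returned an exit status other
# than 1.
# usage: count_unexpected LABEL COUNT COMMAND [ARGS...]
count_unexpected() {
    label=$1
    count=$2
    shift 2
    bad=0
    i=0
    while [ "$i" -lt "$count" ]; do
        status=0
        "$@" || status=$?          # capture the status without aborting
        [ "$status" -eq 1 ] || bad=$((bad + 1))
        i=$((i + 1))
    done
    echo "$label: $bad unexpected of $count"
}
```

For example, 'count_unexpected "nested false.exe" 1000 cmd /c false'
could be compared against
'count_unexpected "cmd builtin exit" 1000 cmd /c "exit /B 1"'.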

The problem feels like a race condition in retrieving process exit 
codes.  Further, it seems that it may only occur when two related 
processes exit in quick succession.

I've been granted several weeks in the near future to work exclusively
on this issue.  Before I start working on it though, I'd like to hear
from other community members who have experienced this and tried to
debug it.  What is and is not known about the issue?  What workarounds
have been tried (especially any that were found to be successful)?  Are
there specific parts of the Cygwin (or bash) code that you recommend
starting with?

The machine that I've been running the above script on is 64-bit Windows 
7 Professional SP1 running under VMware Workstation 8 which is running 
on Kubuntu 12.04.

Relevant parts of 'cygcheck -s' are:

Windows 7 Professional N Ver 6.1 Build 7601 Service Pack 1

Running under WOW64 on AMD64

     Cygwin DLL version info:
         DLL version: 1.7.16
         DLL epoch: 19
         DLL old termios: 5
         DLL malloc env: 28
         Cygwin conv: 181
         API major: 0
         API minor: 262
         Shared data: 5
         DLL identifier: cygwin1
         Mount registry: 3
         Cygwin registry name: Cygwin
         Program options name: Program Options
         Installations name: Installations
         Cygdrive default prefix:
         Build date:
         Shared id: cygwin1S5


Potential app conflicts:

ByteMobile laptop optimization client.

No Cygwin services found.

Cygwin Package Information
Package                    Version              Status
bash                       4.1.10-4             OK
cygwin                     1.7.16-1             OK


Tom.


