Message-ID: <4272C4EE.9080208@agilent.com>
Date: Fri, 29 Apr 2005 16:36:14 -0700
From: Earl Chew
To: cygwin AT cygwin DOT com
Subject: Failure fork/exec/exec vs fork/exec/fork/exec

We've come across a very subtle problem where child processes fail
after some time. On our hyperthreaded systems, the failing child process
consumes one thread and degrades the system (typically locking up the
desktop), and powering off is the only recovery.

We use Cygwin to provide a build system wrapper. Originally, the
implementation would perform some configuration, then exec() the build
supervisor, and the build would proceed:

    bash -> fork/exec -> wrapper -> exec -> supervisor -> etc

This scenario fails as described above. The failure occurs after a
variable amount of time, but typically within tens of minutes.

We've noticed a few ways to work around the problem:

    cmd.exe -> wrapper -> exec -> supervisor -> etc
    bash -> fork/exec -> wrapper -> fork/exec -> supervisor -> etc
    bash -> fork/exec -> wrapper -> spawn(_P_WAIT) -> supervisor -> etc

By experiment, it seems that the key to the failure is the exec/exec
sequence, but I do not know how that corrupts the system so badly that
powering off is the only recourse.

Earl
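
For reference, the two wrapper launch patterns contrasted above can be sketched in C roughly as follows. This is only an illustration of the process-chain shapes, not the actual wrapper and not a reproduction of the failure. The function names launch_by_exec and launch_by_fork_exec are hypothetical, and the spawn(_P_WAIT) variant is not shown.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Pattern reported to fail when the wrapper itself was started by
 * fork/exec from bash: the wrapper replaces its own process image with
 * the supervisor. Returns only if execvp() fails.
 * (Hypothetical helper name, for illustration only.) */
int launch_by_exec(char *const argv[])
{
    assert(argv != NULL && argv[0] != NULL);
    execvp(argv[0], argv);
    perror("execvp");
    return 127;
}

/* Workaround pattern: the wrapper stays alive, forks a child that execs
 * the supervisor, waits for it, and returns its exit status.
 * (Hypothetical helper name, for illustration only.) */
int launch_by_fork_exec(char *const argv[])
{
    assert(argv != NULL && argv[0] != NULL);

    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return 127;
    }
    if (pid == 0) {
        /* Child: become the supervisor. */
        execvp(argv[0], argv);
        perror("execvp");
        _exit(127);
    }

    /* Parent: wait for the child, retrying if interrupted by a signal. */
    int status = 0;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR) {
            perror("waitpid");
            return 127;
        }
    }
    if (WIFEXITED(status))
        return WEXITSTATUS(status);
    /* Killed by a signal: report it the way shells do. */
    return 128 + WTERMSIG(status);
}
```

A wrapper's main() would finish its configuration and then return the result of one of these calls, for example launch_by_fork_exec(argv + 1), so that the supervisor's exit status is passed back to bash.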