From: yoni levi
Date: Mon, 11 Jul 2011 11:07:20 +0300
To: cygwin AT cygwin DOT com
Subject: Random fork failures
Message-ID: <4E1AAF38.9040100@gmail.com>

Hi,

We have had a problem with fork for a very long time. From time to time, fork just hangs, with 2 processes spinning in a busy loop. The problem occurs when we do a lot of spawns (make system). It is very easy to reproduce, but unfortunately there is too much code involved to send.

I did a little investigation. When the process hung, I took a backtrace with Process Explorer. Process Explorer does not know anything about Cygwin debug symbols, so it just gives an arbitrary function + offset. I then used gdb to find the real function: sync_proc_pipe, which calls yield in an endless loop.

This looks like a race condition, since the error depends on timing: when the CPU is very busy (e.g. when I run four copies of while((1)); do true; done), the problem occurs less often.

I dug a little in the mailing list and found many references to fork failures.
I believe this one is similar to my problem: http://cygwin.com/ml/cygwin/2011-04/msg00066.html

I also saw all the recent work on fork (http://old.nabble.com/Re%3A-Improvements-to-fork-handling-td31594702.html), so I tried the latest Cygwin from CVS. This was actually very helpful, and the hang disappeared. The problem is that many other errors and crashes were introduced: many Perl scripts stopped working, rxvt crashes from time to time, gcc fails for no apparent reason, and more.

I would appreciate any advice on how to solve this problem.

Thanks,
Yoni.