Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm
Sender: cygwin-owner AT cygwin DOT com
Mail-Followup-To: cygwin AT cygwin DOT com
Delivered-To: mailing list cygwin AT cygwin DOT com
X-Authenticated: #14308112
Date: Sun, 28 Aug 2005 14:28:13 +0300
From: Pavel Tsekov
X-X-Sender: ptsekov AT mordor
cc: cygwin AT cygwin DOT com
Subject: Re: zsh: command not found => hangs
References: <20050823205318 DOT GF6716 AT bouh DOT ens-lyon DOT fr>
 <20050825001137 DOT GI7338 AT bouh DOT ens-lyon DOT fr>
 <20050825220454 DOT GR7662 AT bouh DOT ens-lyon DOT fr>
 <20050826005349 DOT GA4087 AT trixie DOT casa DOT cgf DOT cx>
 <20050826191429 DOT GA2034 AT trixie DOT casa DOT cgf DOT cx>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Y-GMX-Trusted: 0
X-IsSubscribed: yes

Hello,

I did trace this problem and it looks like a race in Cygwin, but I may be
wrong. Here I am providing two backtraces from a debug session - the first
one shows what happens normally and the second one shows the hang:

=== In this case zsh doesn't hang ===

(gdb) break 593
Breakpoint 1 at 0x10049315: file /home/ptsekov/src/zsh-4.2.4/Src/signals.c, line 593.
(gdb) c
Continuing.
[Switching to thread 1708.0x1a8]

Breakpoint 1, zhandler (sig=20) at /home/ptsekov/src/zsh-4.2.4/Src/signals.c:601
601     } /* handler */
(gdb) n
0x6109259c in sigreturn () at ../../../../src/winsup/cygwin/cygtls.h:239
239       _myfault = NULL;
Current language:  auto; currently c++
(gdb)
set_process_mask (newmask=524288) at ../../../../src/winsup/cygwin/exceptions.cc:912
912     {
(gdb)
913       set_signal_mask (newmask, myself->getsigmask ());
(gdb)
914     }
(gdb)
0x610925b1 in sigreturn () at ../../../../src/winsup/cygwin/cygtls.h:239
239       _myfault = NULL;
(gdb)
signal_suspend (sig=20, sig2=0) at /home/ptsekov/src/zsh-4.2.4/Src/signals.c:401
401     }
Current language:  auto; currently c
(gdb)
zwaitjob (job=-1, sig=0) at /home/ptsekov/src/zsh-4.2.4/Src/jobs.c:1152
1152        if (subsh) {

=== End ===

=== In this case zsh hangs ===

Breakpoint 1, zhandler (sig=20) at /home/ptsekov/src/zsh-4.2.4/Src/signals.c:601
601     } /* handler */
(gdb) n
_cygtls::call_signal_handler (this=0x22f074) at ../../../../src/winsup/cygwin/exceptions.cc:1227
1227          incyg++;
Current language:  auto; currently c++
(gdb)
1228          set_signal_mask (this_oldmask, myself->getsigmask ());
(gdb)
1229          if (this_errno >= 0)
(gdb)
1213      while (sig)
(gdb) n
1233      return this_sa_flags & SA_RESTART;
(gdb)
1234    }
(gdb)
0x610926e4 in stabilize_sig_stack () at ../../../../src/winsup/cygwin/cygtls.h:239
239       _myfault = NULL;
(gdb)
pthread::is_good_object (thread=0x22ed0c) at ../../../../src/winsup/cygwin/thread.cc:234
234       return true;
(gdb)
235     }
(gdb)
cancelable_wait (object=0x350, timeout=4294967295, cancel_action=cw_cancel_self, sig_wait=cw_sig_nosig)
    at ../../../../src/winsup/cygwin/thread.cc:830
830       cancel_n = WAIT_OBJECT_0 + num++;
(gdb)
831       wait_objects[cancel_n] = thread->cancel_event;
(gdb)
835       if (sig_wait == cw_sig_nosig || &_my_tls != _main_tls)
(gdb)
836         sig_n = (DWORD) -1;
(gdb)
845       res = WaitForMultipleObjects (num, wait_objects, FALSE, timeout);
(gdb)
(gdb) bt
#0  cancelable_wait (object=0x350, timeout=4294967295,
    cancel_action=cw_cancel_self, sig_wait=cw_sig_nosig)
    at ../../../../src/winsup/cygwin/thread.cc:845
#1  0x6101cac8 in handle_sigsuspend (tempmask=4294443006)
    at ../../../../src/winsup/cygwin/exceptions.cc:599
#2  0x6109527f in sigsuspend (set=0x22ed9c)
    at ../../../../src/winsup/cygwin/signal.cc:450
#3  0x6109254f in _sigfe () at ../../../../src/winsup/cygwin/cygtls.h:239
#4  0x0022eda8 in ?? ()
#5  0xfff7fffe in ?? ()
#6  0x00000001 in ?? ()
#7  0x10120234 in ?? ()
#8  0x0022edd8 in ?? ()
#9  0x100280e4 in zwaitjob (job=852, sig=0)
    at /home/ptsekov/src/zsh-4.2.4/Src/jobs.c:1145
#10 0x00000000 in ?? () from

=== End ===

First I determined (by using strace) that the hang occurs in or after
zhandler() when it processes SIGCLD. Then I built a debugging zsh, and
after debugging for a while it turned out that the hang occurs after
leaving the signal handler.

P.S. While looking at this I noticed that Cygwin's wait family of
functions won't return 0 if WNOHANG is passed and no children are found
that match the wait criteria - JFYI.

Hope this info helps.