Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm
Message-Id: <200412140556.iBE5uxBe029829@main.electric-cloud.com>
From: "Conrad W Song"
Subject: tty_list::terminate() tty_master NULL pointer deference
Date: Mon, 13 Dec 2004 21:54:52 -0800

Running under Windows 2000 Server, using cygwin1.dll versions 1.5.11.1 and 1.5.12.1, I have seen bash-based "sh -c gcc" commands fail in AS.exe (gas) when invoked repeatedly from a non-Cygwin program.

Quick information about the run:

CYGWIN=tty binmode ntsec nosmbntsec (tty is the important one)
AS.exe exit code = 0x80 (Windows ERROR_WAIT_NO_CHILDREN: a direct result of AS.exe being spawned from gcc and failing with a null-pointer dereference).

The problem did not reproduce under strace logging (too slow, about 3x for me?), but custom logging turned up the following possibility: there is a race/bug between 'tty_list::allocate_tty()' and 'tty_list::terminate()'. The culprits appear to be:

1) 'tty_list::terminate' does not hold the 'tty_mutex' before freeing a tty
2) 'tty::init()' does not clear the 'master_pid' field when called by 'tty_list::terminate()'.

The result is that 'allocate_tty()' can be entered quickly by a new process that reuses the PID of the process which used to own the master tty (for some reason Windows decides to recycle PIDs very quickly). The new process thinks that it is holding the master tty (even though the _prior_ process was terminated: same PID but different process), and 'tty_master' remains NULL.
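
To make the stale 'master_pid' scenario concrete, here is a minimal sketch of it. This is a toy model, not Cygwin source: the names TtySlot, Process, init_buggy, init_fixed, terminate_would_crash and simulate are all hypothetical, the PID 1234 is arbitrary, and the real code paths are certainly more involved.

```cpp
// Toy model of the suspected stale-master_pid problem (NOT Cygwin source).

// Hypothetical stand-in for a shared tty slot that outlives any one process.
struct TtySlot {
    int master_pid = 0;   // PID recorded as owner of the master side
    bool in_use = false;

    // Models the reported behaviour: the slot is released,
    // but master_pid is left holding the old owner's PID.
    void init_buggy() { in_use = false; }

    // Models the attempted fix: also zero master_pid on release.
    void init_fixed() { in_use = false; master_pid = 0; }
};

// Hypothetical per-process state. tty_master is non-null only in the
// process that really created the master tty.
struct Process {
    int pid;
    const char *tty_master;
};

// Models the check at termination: a process that believes it owns the
// master (PID matches) but has a null tty_master would dereference NULL.
bool terminate_would_crash(const Process &p, const TtySlot &slot) {
    bool thinks_it_is_master = (slot.master_pid == p.pid);
    return thinks_it_is_master && p.tty_master == nullptr;
}

// Runs the scenario once; returns true if the second process would crash.
bool simulate(bool use_fix) {
    TtySlot slot;

    // First process allocates the slot and becomes the master.
    Process first{1234, "master"};
    slot.master_pid = first.pid;
    slot.in_use = true;

    // First process terminates and releases the slot.
    if (use_fix) slot.init_fixed();
    else         slot.init_buggy();

    // A new, unrelated process is handed the recycled PID and
    // reuses the same slot without ever creating a master.
    Process second{1234, nullptr};
    slot.in_use = true;

    return terminate_would_crash(second, slot);
}
```

In this model, simulate(false) reports the crash condition because the second process matches the leftover 'master_pid', while simulate(true) does not, which is consistent with the observation that zeroing the field makes the failure go away. The sketch deliberately leaves out 'tty_mutex', so it says nothing about whether locking is also needed.
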
Upon 'tty_list::terminate', the NULL 'tty_master' is then dereferenced.

I have tried fixing 'tty::init()' to zero out the 'master_pid' field, and it appears to solve the problem (I have not checked it for other bad behavior), as does the workaround of using 'CYGWIN=notty'. However, I do not believe that the back-to-back PID reuse is timing sensitive, so I am surprised that the problem did not show up under strace. I therefore suspect a flaw in my analysis and am still suspicious about the need for 'tty_mutex' locking in 'tty_list::terminate'. I will try to provide a reproducing test case soon.

Thanks,
Conrad