Reply-To: fracting AT gmail DOT com
From: Qian Hong
Date: Fri, 2 Oct 2015 21:27:20 +0800
Subject: Proposal to use ThreadQuerySetWin32StartAddress inside munge_threadfunc (Cygwin randomly crashes on Wine)
To: cygwin
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary=001a11c245faf2030705211f21c7

--001a11c245faf2030705211f21c7
Content-Type: text/plain; charset=UTF-8

Dear Cygwin developers,

While testing Cygwin on Wine, I ran into a random crash that has puzzled me for a long time.

The easiest way to reproduce it on my machine is:

1. Install the latest Wine (staging version) from http://www.wine-staging.com/

2. Install the latest Cygwin on Wine:

   $ uname -a
   CYGWIN_NT-5.1 2.2.1(0.289/5/3) 2015-08-20 11:40 i686 Cygwin

3. Run curl to fetch some nonexistent URL, for example:

   $ curl 127.0.0.2

This reproduces the crash almost 100% of the time.

gdb provides a pretty good backtrace:

(gdb) r
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /usr/bin/curl 127.0.0.2
[New Thread 209146.0x330fb]
[New Thread 209146.0x330fc]
[New Thread 209146.0x330fd]

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 209146.0x330fd]
0x00000000 in ?? ()
(gdb) bt
#0 0x00000000 in ?? ()
#1 0x6100609b in _cygtls::call2(unsigned long (*)(void*, void*), void*, void*)@16 (this=0xa2ce64, func=func@entry=0x0, arg=arg@entry=0x611b6c70, buf=buf@entry=0xa2b824) at /usr/src/debug/cygwin-2.2.1-1/winsup/cygwin/cygtls.cc:111
#2 0x61006151 in _cygtls::call (func=0x0, arg=0x611b6c70) at /usr/src/debug/cygwin-2.2.1-1/winsup/cygwin/cygtls.cc:30
#3 0x61088a58 in threadfunc_fe (arg=) at /usr/src/debug/cygwin-2.2.1-1/winsup/cygwin/init.cc:32
#4 0x7bc90da0 in call_thread_func_wrapper+0xc in ntdll (1)
#5 0x7bc90e1c in call_thread_func+0x72 [/home/fracting/src/wine-patched/dlls/ntdll/signal_i386.c:3017] in ntdll (1)
#6 0x7bc90d7e in call_thread_entry_point+0x12 in ntdll (1)
#7 0x7bc9938a in ?? ()RtlCreateUserThread [/home/fracting/src/wine-patched/dlls/ntdll/thread.c:480] in ntdll (1)
#8 0xb7525f16 in ?? ()
#9 0xb745c11e in ?? ()
(gdb)

The hard way to reproduce it on my machine is to build some large project (like Cygwin itself) on Cygwin on top of Wine. Some processes then crash at random, but the failure rate is very low, which puzzled me a lot. (I did verify that both are the same bug by testing several different kinds of workaround.)
However, not every Wine user can reproduce this bug, even with the "easy way". I also can't reproduce it when running under strace.

After investigation, we found that the problem is related to munge_threadfunc:

1. When a new Cygwin thread is created, init.cc:dll_entry() is called with DLL_THREAD_ATTACH, which calls munge_threadfunc().

2. Inside munge_threadfunc(), Cygwin searches for the address of cygthread::stub() in order to locate the thread entry point, and then tries to patch the thread entry point and its copies in the stack frame so that they point to a wrapper function called threadfunc_fe().

3. According to my tests, the search result on Windows is always as expected.

4. On Wine, however, the search result is not reliable. Sometimes threadfunc_ix[0] points to Wine's mod_name inside dlls/ntdll/loader.c:MODULE_InitDLL(), which is called every time a thread is initialized. This is unexpected: the purpose of munge_threadfunc() is to find the thread entry point, but on Wine some garbage data in memory happens to be equal to the address of the thread entry point, so munge_threadfunc() finds the wrong address and stores the wrong offset in threadfunc_ix[].

5. Because the offset may be wrong, ebp[threadfunc_ix[0]] sometimes holds unexpected data on Wine, so "TlsSetValue (_my_oldfunc, threadfunc)" stores the wrong data and "TlsGetValue (_my_oldfunc)" returns the wrong data, which makes Cygwin crash at random.

We have a simple hack on the Wine side that makes Cygwin happy, attached as 0001-ntdll-Initialize-mod_name-to-zero.txt. The hack works because, once the mod_name array is filled with zeros, it no longer contains garbage data that can confuse munge_threadfunc(). However, this hack is ugly for Wine and is not reliable across compilers or compiler options. When people build Wine with a newer or older compiler, or with a different optimization level, the offsets might shift slightly and the problem would reappear.
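To make the failure mode in steps 2 to 5 easier to follow, here is a minimal sketch in portable C. It is a simplified model and not the actual Cygwin code: the stack is modelled as a plain array, and the names record_offsets, entry_from_cached_offset, simulate_false_match, and the constants are all made up for illustration. It shows how a stray copy of the searched-for value can be recorded as the first offset, so that a later thread reads garbage from that slot instead of its entry point.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_OFFSETS 7  /* hypothetical limit on recorded offsets */

/* Scan a modelled stack for `needle` and record the index of each match.
   Returns the number of offsets recorded (at most `max`). */
size_t record_offsets(const uintptr_t *stack, size_t len,
                      uintptr_t needle, size_t *ix, size_t max)
{
    size_t found = 0;
    for (size_t i = 0; i < len && found < max; i++)
        if (stack[i] == needle)
            ix[found++] = i;
    return found;
}

/* A later thread trusts the cached first offset without re-checking it. */
uintptr_t entry_from_cached_offset(const uintptr_t *stack, const size_t *ix)
{
    return stack[ix[0]];
}

/* Walk through the scenario: the first stack contains a stray copy of the
   stub address in an uninitialized scratch buffer (index 1) ahead of the
   real entry-point slots (indices 3 and 5). */
uintptr_t simulate_false_match(void)
{
    const uintptr_t stub = 0x61001000u;        /* made-up stub address */
    const uintptr_t other_entry = 0x61002000u; /* made-up later entry point */
    uintptr_t first[6] = { 1, stub, 2, stub, 3, stub };
    uintptr_t later[6] = { 1, 0, 2, other_entry, 3, other_entry };
    size_t ix[MAX_OFFSETS] = { 0 };

    record_offsets(first, 6, stub, ix, MAX_OFFSETS);

    /* ix[0] is 1 (the scratch buffer), so the later thread reads 0 here
       rather than other_entry. */
    return entry_from_cached_offset(later, ix);
}
```

In this model simulate_false_match() returns 0, which corresponds to the func=0x0 seen in the backtrace, although the real value depends on whatever happens to be in the scratch buffer at that moment.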
Alternatively, we also tried a hack on the Cygwin side, which uses ThreadQuerySetWin32StartAddress to query the thread entry point, as shown in 0001-hack-use-ThreadQuerySetWin32StartAddress.txt. I tested this hack with a recent Cygwin git checkout and can confirm that it works for me (without the hack on the Wine side). I also tested my own Cygwin build with this hack on Windows to confirm that it doesn't break anything.

Would this approach be acceptable to Cygwin? I understand that nobody likes changing code that works (on Windows), but using ThreadQuerySetWin32StartAddress seems like an improvement over relying on the result of a search through stack memory. It would be great if we could discuss a solution that makes both Cygwin and Wine happy.

MSDN says, "Note that on versions of Windows prior to Windows Vista, the returned start address is only reliable before the thread starts running." In practice, I tested my build on Windows XP SP2 and it works for me. Additionally, since Cygwin is approaching the end of its Windows XP support, this may be the right time to make the change.

Any comments are greatly appreciated!
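As a rough illustration of why an authoritative start address helps, the sketch below patches a stack slot only if it currently holds the known entry point. This is a portable model under stated assumptions and not the attached patch itself: the stack is again a plain array, patch_verified is a made-up name, and the `start` parameter stands in for the value that a ThreadQuerySetWin32StartAddress query would report on a real system.

```c
#include <stddef.h>
#include <stdint.h>

/* Overwrite the slots named by the cached offsets with `wrapper`, but only
   where the slot currently holds `start`, the authoritative entry point.
   Slots holding anything else (for example a stale false match) are left
   untouched. Returns the number of slots patched. */
size_t patch_verified(uintptr_t *stack, size_t len,
                      const size_t *ix, size_t n_ix,
                      uintptr_t start, uintptr_t wrapper)
{
    size_t patched = 0;
    for (size_t k = 0; k < n_ix; k++) {
        size_t i = ix[k];
        if (i < len && stack[i] == start) {
            stack[i] = wrapper;
            patched++;
        }
    }
    return patched;
}
```

With cached offsets {1, 3, 5} and a stack whose slot 1 holds unrelated data, only slots 3 and 5 are rewritten, and the value saved for later use is `start` itself rather than whatever slot 1 happened to contain.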
cross-reference: https://bugs.wine-staging.com/show_bug.cgi?id=561 [1] https://github.com/wine-compholio/wine-patched/blob/5dee89ca82c36bf191ce3e26011b82dc87a42d4a/dlls/ntdll/loader.c#L1150 -- Regards, Qian Hong - http://www.winehq.org --001a11c245faf2030705211f21c7 Content-Type: text/plain; charset=US-ASCII; name="0001-ntdll-Initialize-mod_name-to-zero.txt" Content-Disposition: attachment; filename="0001-ntdll-Initialize-mod_name-to-zero.txt" Content-Transfer-Encoding: base64 X-Attachment-Id: f_if9h1hkz0 RnJvbSBhMjQ1ZTczMDZjNzg1NWUwOTE1OWM4OTcwZmMyZTZhOWJhODhkZDRj IE1vbiBTZXAgMTcgMDA6MDA6MDAgMjAwMQpGcm9tOiBRaWFuIEhvbmcgPHFo b25nQGNvZGV3ZWF2ZXJzLmNvbT4KRGF0ZTogV2VkLCA5IFNlcCAyMDE1IDA1 OjMxOjE4ICswODAwClN1YmplY3Q6IFtQQVRDSF0gbnRkbGw6IEluaXRpYWxp emUgbW9kX25hbWUgdG8gemVyby4KCi0tLQogZGxscy9udGRsbC9sb2FkZXIu YyB8IDYgKysrKysrCiAxIGZpbGUgY2hhbmdlZCwgNiBpbnNlcnRpb25zKCsp CgpkaWZmIC0tZ2l0IGEvZGxscy9udGRsbC9sb2FkZXIuYyBiL2RsbHMvbnRk bGwvbG9hZGVyLmMKaW5kZXggZjk4ZTBiMy4uN2FjNDRhMyAxMDA2NDQKLS0t IGEvZGxscy9udGRsbC9sb2FkZXIuYworKysgYi9kbGxzL250ZGxsL2xvYWRl ci5jCkBAIC0xMTk0LDYgKzExOTQsMTIgQEAgc3RhdGljIE5UU1RBVFVTIE1P RFVMRV9Jbml0RExMKCBXSU5FX01PRFJFRiAqd20sIFVJTlQgcmVhc29uLCBM UFZPSUQgbHBSZXNlcnZlZAogICAgIGlmICh3bS0+bGRyLlRsc0luZGV4ICE9 IC0xKSBjYWxsX3Rsc19jYWxsYmFja3MoIHdtLT5sZHIuQmFzZUFkZHJlc3Ms IHJlYXNvbiApOwogICAgIGlmICghZW50cnkgfHwgISh3bS0+bGRyLkZsYWdz ICYgTERSX0lNQUdFX0lTX0RMTCkpIHJldHVybiBTVEFUVVNfU1VDQ0VTUzsK IAorICAgIC8qCisgICAgbWVtc2V0KCBtb2RfbmFtZSwgMCwgMTYpOyAvL2Ny YXNoCisgICAgbWVtc2V0KCBtb2RfbmFtZSwgMCwgMTcpOyAvL3dvcmtzCisg ICAgKi8KKyAgICBtZW1zZXQoIG1vZF9uYW1lLCAwLCBzaXplb2YobW9kX25h bWUpICk7CisKICAgICBpZiAoVFJBQ0VfT04ocmVsYXkpKQogICAgIHsKICAg ICAgICAgc2l6ZV90IGxlbiA9IG1pbiggd20tPmxkci5CYXNlRGxsTmFtZS5M ZW5ndGgsIHNpemVvZihtb2RfbmFtZSktc2l6ZW9mKFdDSEFSKSApOwotLSAK Mi4xLjAKCg== --001a11c245faf2030705211f21c7 Content-Type: text/plain; charset=US-ASCII; name="0001-hack-use-ThreadQuerySetWin32StartAddress.txt" Content-Disposition: attachment; 
filename="0001-hack-use-ThreadQuerySetWin32StartAddress.txt" Content-Transfer-Encoding: base64 X-Attachment-Id: f_if9i7btm1 ZGlmZiAtLWdpdCBhL3dpbnN1cC9jeWd3aW4vaW5pdC5jYyBiL3dpbnN1cC9j eWd3aW4vaW5pdC5jYwppbmRleCA1NmQ0NjY4Li5lYjgyMmE1IDEwMDY0NAot LS0gYS93aW5zdXAvY3lnd2luL2luaXQuY2MKKysrIGIvd2luc3VwL2N5Z3dp bi9pbml0LmNjCkBAIC01NSwxNCArNTUsMjMgQEAgbXVuZ2VfdGhyZWFkZnVu YyAoKQogCiAgIGlmICh0aHJlYWRmdW5jX2l4WzBdKQogICAgIHsKLSAgICAg IGNoYXIgKnRocmVhZGZ1bmMgPSBlYnBbdGhyZWFkZnVuY19peFswXV07Cisg ICAgICBjaGFyICp0aHJlYWRmdW5jMCA9IGVicFt0aHJlYWRmdW5jX2l4WzBd XTsKKyAgICAgIGNoYXIgKnRocmVhZGZ1bmM7CisgICAgICBEV09SRCByZXQ7 CisKKyAgICAgIE50UXVlcnlJbmZvcm1hdGlvblRocmVhZChHZXRDdXJyZW50 VGhyZWFkKCksIFRocmVhZFF1ZXJ5U2V0V2luMzJTdGFydEFkZHJlc3MsICZ0 aHJlYWRmdW5jLCBzaXplb2YodGhyZWFkZnVuYyksICZyZXQpOworCisgICAg ICBzeXNjYWxsX3ByaW50ZigidGhyZWFkZnVuYzAgJXBcbiIsIHRocmVhZGZ1 bmMwKTsKKyAgICAgIHN5c2NhbGxfcHJpbnRmKCJ0aHJlYWRmdW5jICVwXG4i LCB0aHJlYWRmdW5jKTsKKyAgICAgIHN5c2NhbGxfcHJpbnRmKCJzZWFyY2hf Zm9yICVwXG4iLCBzZWFyY2hfZm9yKTsKICAgICAgIGlmICghc2VhcmNoX2Zv ciB8fCB0aHJlYWRmdW5jID09IHNlYXJjaF9mb3IpCi0JewotCSAgc2VhcmNo X2ZvciA9IE5VTEw7Ci0JICBmb3IgKGkgPSAwOyB0aHJlYWRmdW5jX2l4W2ld OyBpKyspCi0JICAgIGVicFt0aHJlYWRmdW5jX2l4W2ldXSA9IChjaGFyICop IHRocmVhZGZ1bmNfZmU7Ci0JICBUbHNTZXRWYWx1ZSAoX215X29sZGZ1bmMs IHRocmVhZGZ1bmMpOwotCX0KKyAgICAgICAgeworICAgICAgICAgIHNlYXJj aF9mb3IgPSBOVUxMOworICAgICAgICAgIGZvciAoaSA9IDA7IHRocmVhZGZ1 bmNfaXhbaV07IGkrKykKKyAgICAgICAgICAgIGlmIChlYnBbdGhyZWFkZnVu Y19peFtpXV0gPT0gdGhyZWFkZnVuYykKKyAgICAgICAgICAgICAgIGVicFt0 aHJlYWRmdW5jX2l4W2ldXSA9IChjaGFyICopIHRocmVhZGZ1bmNfZmU7Cisg ICAgICAgICAgVGxzU2V0VmFsdWUgKF9teV9vbGRmdW5jLCB0aHJlYWRmdW5j KTsKKyAgICAgICAgfQogICAgIH0KIH0KIApkaWZmIC0tZ2l0IGEvd2luc3Vw L2N5Z3dpbi9udGRsbC5oIGIvd2luc3VwL2N5Z3dpbi9udGRsbC5oCmluZGV4 IDEzYTEzMWQuLjA1MGU4NDggMTAwNjQ0Ci0tLSBhL3dpbnN1cC9jeWd3aW4v bnRkbGwuaAorKysgYi93aW5zdXAvY3lnd2luL250ZGxsLmgKQEAgLTExNjIs NyArMTE2Miw4IEBAIHR5cGVkZWYgZW51bSBfVEhSRUFESU5GT0NMQVNTCiB7 
CiAgIFRocmVhZEJhc2ljSW5mb3JtYXRpb24gPSAwLAogICBUaHJlYWRUaW1l cyA9IDEsCi0gIFRocmVhZEltcGVyc29uYXRpb25Ub2tlbiA9IDUKKyAgVGhy ZWFkSW1wZXJzb25hdGlvblRva2VuID0gNSwKKyAgVGhyZWFkUXVlcnlTZXRX aW4zMlN0YXJ0QWRkcmVzcyA9IDkKIH0gVEhSRUFESU5GT0NMQVNTLCAqUFRI UkVBRElORk9DTEFTUzsKIAogLyogQ2hlY2tlZCBvbiA2NCBiaXQuICovCg== --001a11c245faf2030705211f21c7 Content-Type: text/plain; charset=us-ascii -- Problem reports: http://cygwin.com/problems.html FAQ: http://cygwin.com/faq/ Documentation: http://cygwin.com/docs.html Unsubscribe info: http://cygwin.com/ml/#unsubscribe-simple --001a11c245faf2030705211f21c7--