From: Joe Buehler
Reply-To: joseph DOT buehler AT spirentcom DOT com
Organization: Spirent Communications
Date: Fri, 19 Jul 2002 22:08:59 -0400
To: Cygwin Developers
Subject: [bug found] Re: cygwin hang problem
Message-ID: <3D38C63B.1070201@hekimian.com>

OK, I think I see what the problem may be.

In the dll_func_load code (assembly language), the dll linkage code is
patched (rewritten) once the address of the loaded dll function is known.
The problem is that there is a race: the new opcode and its argument are
written separately. What happens is this:

1. a mov instruction is overwritten with 0xe9 to become a jmp
2. another thread executes the jmp before step 3
3. the newly written jmp instruction gets the proper offset written

Since the mov instruction uses an offset from the beginning of the segment,
and the jmp uses an EIP-relative offset, the net effect is that the jmp
goes off in the weeds. The data in the dll linkage code is what causes
the observed behavior of a jump to twice the value of the linkage data:
the mov instruction references memory just a few bytes further down, so
its stale operand is nearly equal to the instruction's own address, and
adding it to EIP roughly doubles it.
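
The three steps above can be sketched as a small simulation. This is only an illustration, not the actual dll_func_load code: the addresses BASE, DATA and REAL_TARGET and the decode() helper are made up for the example, and the decoder understands only the two 5-byte instruction forms discussed here.

```python
import struct

BASE = 0x1000         # hypothetical address of the 5-byte linkage instruction
DATA = BASE + 7       # hypothetical address of the linkage data the mov reads
REAL_TARGET = 0x5000  # hypothetical address of the resolved dll function


def decode(code, addr):
    """Decode only the two 5-byte forms discussed in this message."""
    opcode = code[0]
    (operand,) = struct.unpack_from("<I", code, 1)  # little-endian 32-bit operand
    if opcode == 0xA1:
        # mov moffs32,%eax: the operand is an absolute address
        return ("mov", operand)
    if opcode == 0xE9:
        # jmp rel32: the operand is relative to the next instruction
        return ("jmp", (addr + 5 + operand) & 0xFFFFFFFF)
    raise ValueError("unexpected opcode")


# Initial state: mov DATA,%eax
code = bytearray([0xA1]) + bytearray(struct.pack("<I", DATA))
before = decode(code, BASE)

# Step 1: only the opcode byte is rewritten.
code[0] = 0xE9
torn = decode(code, BASE)  # what a second thread would execute in step 2

# Step 3: the operand is rewritten as a relative offset.
struct.pack_into("<I", code, 1, (REAL_TARGET - (BASE + 5)) & 0xFFFFFFFF)
after = decode(code, BASE)

for label, state in (("before", before), ("torn", torn), ("after", after)):
    print(f"{label:6s} {state[0]} 0x{state[1]:08x}")
```

In the torn state the jmp lands at BASE + 5 + DATA, which has nothing to do with REAL_TARGET; it becomes correct only once step 3 has completed.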

In the core that I am looking at, here is what is at win32_CopySid@12:

  0x610f00b8: 0xa1 0xbf 0x00 0x0f 0x61    # mov 0x610f00bf,%eax

At just the wrong moment, this becomes:

  0x610f00b8: 0xe9 0xbf 0x00 0x0f 0x61    # jmp %eip+0x610f00bf

So the locking in the dll linkage code needs some changing. There is in
fact a comment above dll_func_load that the code may not be thread safe!

Joe Buehler
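
For reference, the arithmetic for the bytes in the core can be checked with a few lines of Python. This is a sketch that assumes a flat 32-bit address space and that the jmp offset is applied to the address of the following instruction; the variable names are mine.

```python
import struct

ADDR = 0x610F00B8                                 # instruction address from the core
original = bytes([0xA1, 0xBF, 0x00, 0x0F, 0x61])  # mov 0x610f00bf,%eax
torn = bytes([0xE9]) + original[1:]               # opcode patched, operand still stale

(operand,) = struct.unpack_from("<I", torn, 1)    # little-endian 32-bit operand
next_insn = ADDR + len(torn)                      # EIP once the 5-byte jmp is fetched
target = (next_insn + operand) & 0xFFFFFFFF       # where the torn jmp lands

print(f"operand   0x{operand:08x}")
print(f"target    0x{target:08x}")
print(f"2*operand 0x{(2 * operand) & 0xFFFFFFFF:08x}")
```

The target comes out two bytes short of exactly twice the operand, because the operand points 7 bytes past the start of the instruction while EIP is only 5 bytes past it.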