Message-Id: <199809211602.RAA13931@rochefort.ns.easynet.net>
Comments: Authenticated sender is
From: "George Foot"
To: Rylan
Date: Fri, 18 Sep 1998 19:28:12 +0000
MIME-Version: 1.0
Content-type: text/plain; charset=US-ASCII
Content-transfer-encoding: 7BIT
Subject: Re: -O3 and -O2 breaks my NASM code
Reply-to: mert0407 AT sable DOT ox DOT ac DOT uk
CC: djgpp AT delorie DOT com
Precedence: bulk

On 18 Sep 98 at 13:46, Rylan wrote:

> I've run into a situation where attempting to use -O3 and even -O2 with ANY
> code that calls NASM compiled functions (in their own, separate .O) compiles
> fine but crashes the moment the NASM code is reached. This happens without a
> stack trace, nothing - the whole program just stops. Unoptimized compiles of
> the same code run 100%.
>
> Any ideas why, and how I can get NASM code to coexist with optimised DJGPP
> code?

Are you obeying the calling conventions? In particular, optimised code
will expect a `call' not to affect any of the following registers:

    EBX, ESI, EDI, ESP, EBP, CS, DS, ES, SS

Preserving ESP is pretty normal of course, unless you do something
dodgy. If you use EBP as a frame pointer, you need to push its old
value first and pop it back afterwards. Otherwise it falls into the
same category as all the others -- push the values of any of these
that you change to the stack on entry and pop them back (in the
reverse order of course) on exit.

You can clobber EAX, ECX, EDX, FS and GS, and the FPU's stack. In
fact EAX is often used to return values, sometimes along with EDX,
and sometimes the FPU stack is used instead. So of course you must
be able to clobber these.

See http://users.ox.ac.uk/~mert0407/asmfuncs.txt for a description of
this written with AT&T syntax in mind; of course, the same rules apply
to your NASM Intel-style code.

I suspect the reason your code works fine if you don't optimise is
that gcc, when not optimising, makes far fewer assumptions about what
is in each register across a call.
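As a rough sketch of the push/pop rule above, a NASM routine that uses EBX and ESI might look something like the following. The function name `_sum_array', its C prototype and the `nasm -f coff' command line are made up for illustration; they are assumptions, not anything taken from Rylan's code.

```nasm
; Hypothetical example, callable from C as:
;     int sum_array(const int *p, int n);
; Assumed build step for DJGPP: nasm -f coff sum.asm
bits 32
section .text
global _sum_array               ; DJGPP's COFF symbols carry a leading underscore

_sum_array:
        push    ebp             ; save the caller's frame pointer
        mov     ebp, esp
        push    ebx             ; callee-saved registers this routine changes
        push    esi

        mov     esi, [ebp+8]    ; first argument: pointer to the ints
        mov     ecx, [ebp+12]   ; second argument: count (ECX may be clobbered)
        xor     eax, eax        ; running total, returned in EAX
        test    ecx, ecx
        jle     .done           ; nothing to add if n <= 0
.next:
        mov     ebx, [esi]      ; EBX is used as scratch here, hence the push above
        add     eax, ebx
        add     esi, 4
        dec     ecx
        jnz     .next
.done:
        pop     esi             ; restore in the reverse order of the pushes
        pop     ebx
        pop     ebp
        ret
```

If the pushes and pops of EBX and ESI were left out, this would probably still appear to work in an unoptimised build and then misbehave at -O2, which matches the symptom described.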

For a quick example of why this matters, suppose gcc generates this
code:

        movl $10, %ebx
    1:
        call _your_function
        decl %ebx
        jnz 1b

It's using the fact that the function shouldn't clobber EBX. But if
you wrote that function and it does in fact set EBX to zero each time
it executes, the `decl' will always leave -1 in EBX, the `jnz' will
always be taken, and the loop will be endless.

-- 
george DOT foot AT merton DOT oxford DOT ac DOT uk