Mail Archives: djgpp/1998/09/21/18:44:34

Message-Id: <199809211602.RAA13931@rochefort.ns.easynet.net>
Comments: Authenticated sender is <mert0407 AT sable DOT ox DOT ac DOT uk>
From: "George Foot" <george DOT foot AT merton DOT ox DOT ac DOT uk>
To: Rylan <rylan AT intekom DOT co DOT za>
Date: Fri, 18 Sep 1998 19:28:12 +0000
MIME-Version: 1.0
Subject: Re: -O3 and -O2 breaks my NASM code
Reply-to: mert0407 AT sable DOT ox DOT ac DOT uk
CC: djgpp AT delorie DOT com

On 18 Sep 98 at 13:46, Rylan wrote:

> I've run into a situation where attempting to use -O3 and even -O2 with ANY
> code that calls NASM compiled functions (in their own, separate .O) compiles
> fine but crashes the moment the NASM code is reached. This happens without a
> stack trace, nothing - the whole program just stops. Unoptimized compiles of
> the same code run 100%.
> 
> Any ideas why, and how I can get NASM code to coexist with optimised DJGPP
> code?

Are you obeying the calling conventions?  In particular, 
optimised code will expect a `call' not to affect any of the 
following registers:

    EBX, ESI, EDI, ESP, EBP, CS, DS, ES, SS

Preserving ESP is pretty normal of course, unless you do
something dodgy.  If you use EBP as a frame pointer, you need
to push its old value first and pop it back afterwards.  The
others work the same way: if you change any of them, push its
value onto the stack on entry and pop it back on exit, in the
reverse order of the pushes.
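
As a rough sketch of that save-and-restore pattern in NASM syntax,
here is a made-up function (the name `_sum_array' and its arguments
are only for illustration, not from the original poster's code).  It
assumes the usual DJGPP setup: COFF output, a leading underscore on C
symbols, and arguments passed on the stack.

```nasm
; Hypothetical example: int sum_array(const int *p, int n);
; Assemble for DJGPP with: nasm -f coff sum.asm
bits 32
section .text
global _sum_array           ; C symbols carry a leading underscore

_sum_array:
    push    ebp             ; save the caller's frame pointer
    mov     ebp, esp
    push    ebx             ; callee-saved registers we are about to use
    push    esi

    mov     esi, [ebp+8]    ; first argument: p
    mov     ecx, [ebp+12]   ; second argument: n (ECX may be clobbered)
    xor     eax, eax        ; running total, returned in EAX
    test    ecx, ecx
    jle     .done           ; nothing to add if n <= 0
.next:
    mov     ebx, [esi]      ; EBX used as scratch, hence the push above
    add     eax, ebx
    add     esi, 4
    dec     ecx
    jnz     .next
.done:
    pop     esi             ; restore in reverse order of the pushes
    pop     ebx
    pop     ebp
    ret                     ; caller removes the arguments
```

EBX is used as scratch here purely to show the point; a real version
could use EDX instead and skip one push/pop pair.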

You can clobber EAX, ECX, EDX, FS and GS, and the FPU's stack.  
In fact EAX is often used to return values, sometimes along 
with EDX, and sometimes the FPU stack is used instead.  So of 
course you must be able to clobber these.
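
To illustrate the return conventions, here are two made-up functions
(the names `_mul_wide' and `_scale' are hypothetical).  This is a
sketch assuming gcc's usual i386 behaviour, where a 64-bit integer
comes back in EDX:EAX and a floating point value comes back on top of
the FPU stack.

```nasm
; Hypothetical examples; assemble for DJGPP with: nasm -f coff ret.asm
bits 32
section .text
global _mul_wide
global _scale

; unsigned long long mul_wide(unsigned a, unsigned b);
_mul_wide:
    mov     eax, [esp+4]        ; a
    mul     dword [esp+8]       ; EDX:EAX = a * b
    ret                         ; low half in EAX, high half in EDX

; float scale(float x, float k);
_scale:
    fld     dword [esp+4]       ; push x onto the FPU stack
    fmul    dword [esp+8]       ; ST0 = x * k
    ret                         ; result is left in ST0
```

Neither function touches a register from the preserved list, so no
pushes are needed.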

See http://users.ox.ac.uk/~mert0407/asmfuncs.txt for a
description of this oriented towards AT&T syntax; of course, the
same rules apply to your NASM Intel-style code.

I suspect the reason your code works fine if you don't optimise 
is that gcc, when not optimising, makes far fewer assumptions 
about what is in each register.

For a quick example of why this matters, suppose gcc generates 
this code:

    movl   $10, %ebx
  1:
    call   _your_function
    decl   %ebx
    jnz    1b

It relies on the function not clobbering EBX.  But if you wrote 
that function and it in fact sets EBX to zero each time it 
executes, `decl' takes EBX to -1 on every pass, `jnz' is always 
taken, and the loop never ends.
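
On the NASM side, the fix for a function like that might look
something like this.  It is only a sketch: the body is a placeholder,
and `_your_function' is just the name used in the gcc output above.

```nasm
bits 32
section .text
global _your_function

_your_function:
    push    ebx             ; keep the caller's loop counter safe
    xor     ebx, ebx        ; now EBX is ours to use as scratch
    ; (body that uses EBX would go here)
    pop     ebx             ; caller sees the value it had before the call
    ret
```

Without the push/pop pair, the caller gets EBX back as zero and the
loop above spins forever.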

-- 
george DOT foot AT merton DOT oxford DOT ac DOT uk
