Mail Archives: djgpp/2001/02/07/12:48:15

X-Authentication-Warning: acp3bf.physik.rwth-aachen.de: broeker owned process doing -bs
Date: Wed, 7 Feb 2001 16:08:53 +0100 (MET)
From: Hans-Bernhard Broeker <broeker AT physik DOT rwth-aachen DOT de>
X-Sender: broeker AT acp3bf
To: Eli Zaretskii <eliz AT is DOT elta DOT co DOT il>
cc: djgpp AT delorie DOT com
Subject: Re: Function and File ordering and speed optimization
In-Reply-To: <Pine.SUN.3.91.1010207083723.2148M-100000@is>
Message-ID: <Pine.LNX.4.10.10102071554500.4137-100000@acp3bf>
MIME-Version: 1.0
Reply-To: djgpp AT delorie DOT com
Errors-To: nobody AT delorie DOT com
X-Mailing-List: djgpp AT delorie DOT com
X-Unsubscribes-To: listserv AT delorie DOT com

On Wed, 7 Feb 2001, Eli Zaretskii wrote:

> Keeping frequently-used code together improves the locality of the code, 
> exactly as referencing a large array in order improves the locality of 
> data.  When your program or your OS pages, this really makes a 
> difference.

I think it is only going to make a noticeable difference if the active
code set is already very close (within a few 4KB pages) to the amount of
free memory. In such a situation, reducing the memory footprint might
indeed help to avoid paging altogether, and thus speed things up. But I
doubt there are enough programs that fulfill these conditions to make this
type of optimization worth bothering with.
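
A minimal sketch of what "active code set in 4KB pages" means, assuming a
hypothetical table of hot functions with made-up offsets and sizes (nothing
here comes from a real link map or from gprof output). It counts how many
distinct pages the hot functions touch under two layouts: scattered across
the image, and grouped together.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096u   /* 4KB pages, as on the i386 */
#define MAX_PAGES 1024u   /* illustrative limit: images up to 4MB */

/* Hypothetical placement of one hot function inside the program image. */
struct func_span {
    uint32_t start;   /* offset of the first byte */
    uint32_t size;    /* length in bytes, must be > 0 */
};

/* Count the distinct pages covered by the given functions. */
size_t pages_touched(const struct func_span *funcs, size_t n)
{
    unsigned char seen[MAX_PAGES] = {0};
    size_t count = 0;

    for (size_t i = 0; i < n; i++) {
        assert(funcs[i].size > 0);
        uint32_t first = funcs[i].start / PAGE_SIZE;
        uint32_t last  = (funcs[i].start + funcs[i].size - 1) / PAGE_SIZE;
        assert(last < MAX_PAGES);

        for (uint32_t p = first; p <= last; p++) {
            if (!seen[p]) {      /* count each page only once */
                seen[p] = 1;
                count++;
            }
        }
    }
    return count;
}

/* Made-up example: three 512-byte hot functions, far apart vs. adjacent. */
const struct func_span scattered_layout[3] = {
    { 0x0000, 512 }, { 0x5000, 512 }, { 0xA000, 512 }
};
const struct func_span grouped_layout[3] = {
    { 0x0000, 512 }, { 0x0200, 512 }, { 0x0400, 512 }
};
```

With these invented numbers, pages_touched(scattered_layout, 3) is 3 and
pages_touched(grouped_layout, 3) is 1. Reordering only matters for paging
when a saving of that size is what separates "fits in free memory" from
"does not fit".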

And a virtual-memory multitasking environment will actually reduce the
probability of this making a big difference, IMHO: it continuously changes
the effective amount of RAM available to the program, and with it the
chance that a reduction in the program's active pages has any positive
effect.

Let's face it: once the system starts to page stuff in and out on a
somewhat regular basis, all optimization is moot. Overall speed is then
determined almost entirely by the hard disk, no matter how efficient
your programs are.
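
A rough way to put numbers on this is the usual weighted average of RAM and
page-fault cost. The sketch below assumes illustrative latencies of 100 ns
for a RAM access and 10 ms to service a page fault from disk; both figures
are assumptions for the sake of the example, not measurements, and the
function name is hypothetical.

```c
#include <assert.h>

/*
 * Average cost in nanoseconds of one memory reference when a fraction
 * fault_rate (0.0 to 1.0) of references has to be paged in from disk.
 * mem_ns and fault_ns are assumed latencies supplied by the caller.
 */
double effective_access_ns(double fault_rate, double mem_ns, double fault_ns)
{
    assert(fault_rate >= 0.0 && fault_rate <= 1.0);
    return (1.0 - fault_rate) * mem_ns + fault_rate * fault_ns;
}
```

Under those assumed latencies, effective_access_ns(0.001, 100.0, 1.0e7)
comes to about 10,100 ns: a single fault per thousand references already
makes the average access roughly a hundred times slower than RAM, and
nearly all of that time is disk time.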

> > The original motivation for the function ordering offered by gprof, IIRC,
> > is for processors where the cost of a jump varies strongly with its
> > distance. In segmented x86 operation modes, e.g., it could pay off to
> > reorder functions if it allowed short jumps and calls instead of far ones.
> 
> Does this hold on IA64 as well?

I don't have the slightest idea. IA64 is supposed to be 100% different
from everything we ever learned about the x86 (now a.k.a. IA32) series.

But in general, I think that the more 'indirection' the CPU applies when
executing the input machine code, i.e. the more caches, instruction
re-interpretation by microcode, register renaming and what have you are
involved, the smaller the effect of function ordering is going to be. The
assumptions on which its effects rely are just too easily broken by such
mechanisms.

I expect it to make much more of a difference in, say, microcontrollers
or DSPs than in any modern full-blown computer.

-- 
Hans-Bernhard Broeker (broeker AT physik DOT rwth-aachen DOT de)
Even if all the snow were burnt, ashes would remain.
