Date: Wed, 7 Feb 2001 16:08:53 +0100 (MET)
From: Hans-Bernhard Broeker
To: Eli Zaretskii
cc: djgpp AT delorie DOT com
Subject: Re: Function and File ordering and speed optimization

On Wed, 7 Feb 2001, Eli Zaretskii wrote:

> Keeping frequently-used code together improves the locality of the code,
> exactly as referencing a large array in order improves the locality of
> data.  When your program or your OS pages, this really makes a
> difference.

I think it is only going to make a noticeable difference if the active
code set is already very close (within a few 4KB pages) to the amount of
free memory.  In such a situation, reducing the memory footprint might
indeed help to avoid paging altogether, and thus speed things up.

But I doubt there are enough programs that fulfill these conditions to be
worth bothering about this type of optimization.  And a virtual-memory
multitasking environment will actually reduce the probability of this
making a big difference, IMHO: it continuously changes the effective
amount of available RAM, and therefore the probability of the reduction
of active pages in the program having a positive effect.

Let's face it: once the system starts to page stuff in and out on a
somewhat regular basis, all optimization is moot.  Overall speed is
determined almost entirely by the hard disk, then, no matter how
efficient your programs are.

> > The original motivation for the function ordering offered by gprof, IIRC,
> > is for processors where the cost of a jump varies strongly with its
> > distance.  In segmented x86 operation modes, e.g., it could pay off to
> > reorder functions if it allowed short jumps and calls instead of far ones.
>
> Does this hold on IA64 as well?

I don't have the slightest idea.  IA64 is supposed to be 100% different
from everything we ever learned about the x86 (now a.k.a. IA32) series.

But in general, I think that the more 'indirection' the CPU applies to
execute the input machine code, i.e. the more caches, instruction
re-interpretation by microcode, register renaming and what have you is
done, the smaller the effect of function ordering is going to be.  The
assumptions upon which its effects rely are just too easy to break by
such mechanisms.  I expect it to make much more of a difference in, say,
microcontrollers or DSPs than in any modern full-blown computer.

-- 
Hans-Bernhard Broeker (broeker AT physik DOT rwth-aachen DOT de)
Even if all the snow were burnt, ashes would remain.
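
For readers who want to see the data-side analogy from the quoted text ("referencing a large array in order"), here is a minimal C sketch. It is not from the thread: the function names sum_in_order and sum_strided are made up for illustration, and the matrix is assumed to be stored row by row in one flat array. Both functions return the same sum and differ only in the order in which they touch memory.

```c
#include <stddef.h>

/* Hypothetical helper: walks a rows x cols matrix, stored row by row in a
 * flat array, in storage order.  Consecutive accesses touch adjacent
 * addresses, so each page (and cache line) is used up before moving on. */
long sum_in_order(const int *m, size_t rows, size_t cols)
{
    long total = 0;
    for (size_t r = 0; r < rows; r++)
        for (size_t c = 0; c < cols; c++)
            total += m[r * cols + c];
    return total;
}

/* Hypothetical helper: same result, but the inner loop steps down a column,
 * so consecutive accesses are cols * sizeof(int) bytes apart.  For a wide
 * enough matrix, successive accesses land on different pages. */
long sum_strided(const int *m, size_t rows, size_t cols)
{
    long total = 0;
    for (size_t c = 0; c < cols; c++)
        for (size_t r = 0; r < rows; r++)
            total += m[r * cols + c];
    return total;
}
```

Whether the two differ measurably in speed depends on the matrix size relative to cache and free memory, which is the same caveat raised above for code ordering.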