Date: Wed, 7 Feb 2001 08:40:39 +0200 (IST)
From: Eli Zaretskii
X-Sender: eliz AT is
To: Hans-Bernhard Broeker
cc: djgpp AT delorie DOT com
Subject: Re: Function and File ordering and speed optimization
In-Reply-To:
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Reply-To: djgpp AT delorie DOT com
Errors-To: nobody AT delorie DOT com
X-Mailing-List: djgpp AT delorie DOT com
X-Unsubscribes-To: listserv AT delorie DOT com
Precedence: bulk

On Tue, 6 Feb 2001, Hans-Bernhard Broeker wrote:

> > ??? One thing I'd do is to group functions which are called many
> > times together.  This would maximize the probability that they are in
> > the L1 cache most of the time.
>
> The things you might be missing would be that the L1 cache is a dynamic
> beast, and that the chunks it caches as one piece of memory ("cache
> lines") are small compared to the average size of an average function.
> I.e. you'll hardly ever fit two or more functions into a single cache
> line.

Sorry, the cache is indeed not the issue.  But the resident set in a
virtual-memory environment, especially on Windows, _is_ an issue,
because 4KB, the size of a page, can hold quite a lot of code.
Keeping frequently-used code together improves the locality of the
code, exactly as referencing a large array in order improves the
locality of data.  When your program or your OS pages, this really
makes a difference.

> The original motivation for the function ordering offered by gprof, IIRC,
> is for processors where the cost of a jump varies strongly with its
> distance.  In segmented x86 operation modes, e.g., it could pay off to
> reorder functions if it allowed short jumps and calls instead of far ones.

Does this hold on IA64 as well?