Mail Archives: djgpp/1997/03/01/10:01:17
nikki (nikki AT gameboutique DOT co) writes:
>> A loop such as
>>
>> register int i,j;
>> for (i=0; i<15; i++) {
>>   /* simple arithmetic */
>>   for (j=0; j<21; j++) {
>>     /* Function call and some math */
>>   }
>>   /* More math */
>> }
>
> hardly a great surprise seeing as the loop above would quite probably fit in
> the cache when well optimised, but unrolled would thrash it horribly.
> unrolling loops won't save an enormous amount of time, after all a jump
> instruction will only take you 3 or 4 cycles at most.
I was thinking of the jump, plus the test for end of loop.
As for caching, I'm looking to improve speed on a 486. Yes, one
of those pre-Pentium dinosaurs that are dwindling in population and are on
the endangered species list along with such uncommon specimens as Amigas
and Macintoshes and the 8-bit Nintendo, but are not yet extinct. Some end
users of my program (not to mention the developer ;)) might have such
oddball museum-pieces lying around, with their lack of caches, and lack of
pipelining, and lack of certain machine instructions...;) I figure that on
any reasonably recent Pentium, the speed of the program will be limited more
by the built-in "brake" that holds it to twenty main loops (a different loop!
The loops being unrolled run in their entirety on every main loop) than by
caching, the lack thereof, or those loops.
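
To make "the jump, plus the test for end of loop" concrete, here is a rough
sketch of the loop above with the inner loop unrolled by three. The body of
work() is a made-up stand-in for the "function call and some math" (the real
code isn't shown in this thread), and the names and the unroll factor are
only assumptions for illustration. 21 divides evenly by 3, so no cleanup
loop is needed.

```c
#include <assert.h>

#define OUTER  15   /* outer trip count from the loop in this thread */
#define INNER  21   /* inner trip count from the loop in this thread */
#define UNROLL 3    /* assumed unroll factor; must divide INNER evenly */

/* Hypothetical stand-in for the inner loop's "function call and some math". */
static long work(int i, int j)
{
    return (long)i * INNER + j;
}

/* Rolled version: 21 end-of-loop tests and jumps per outer pass. */
long sum_rolled(void)
{
    long total = 0;
    int i, j;
    for (i = 0; i < OUTER; i++) {
        for (j = 0; j < INNER; j++) {
            total += work(i, j);
        }
    }
    return total;
}

/* Inner loop unrolled by 3: 7 end-of-loop tests and jumps per outer pass. */
long sum_unrolled(void)
{
    long total = 0;
    int i, j;
    assert(INNER % UNROLL == 0);   /* otherwise a cleanup loop is required */
    for (i = 0; i < OUTER; i++) {
        for (j = 0; j < INNER; j += UNROLL) {
            total += work(i, j);
            total += work(i, j + 1);
            total += work(i, j + 2);
        }
    }
    return total;
}
```

Both versions call work() 315 times; the unrolled one only trades 14 of the
21 inner tests and jumps per outer pass for a bigger loop body. Whether that
trade pays off on a given machine is something to measure, not assume.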
--
.*. Where feelings are concerned, answers are rarely simple [GeneDeWeese]
-() < When I go to the theater, I always go straight to the "bag and mix"
`*' bulk candy section...because variety is the spice of life... [me]
Paul Derbyshire ao950 AT freenet DOT carleton DOT ca, http://chat.carleton.ca/~pderbysh