Mail Archives: djgpp/1997/03/04/09:52:07
Jeff Weeks <pweeks AT execulink DOT com> wrote in article
<5ffjqj$g5a AT nr1 DOT toronto DOT istar DOT net>...
> All this talk about unrolling loop and I feel like an idiot :) What
> exactly is invovled in the process of unrolling loops?
>
Loop unrolling is the simple optimisation process of taking a constant
iteration loop and reducing the number of iterations performed. For
example, you may have a routine that draws bitmapped 8x8 characters onto
the screen (assuming characters are stored as 8 rows of 8 pixels)...
unsigned char *source; /* Pointer to character bitmap data */
unsigned char *screen; /* Pointer to screen position where character will
be drawn */
int x, y; /* Loop variables */
unsigned char c; /* ASCII value to draw */
for (y=0; y<8; y++)
{
for (x=0; x<8; x++)
{
if (source[c * 64 + y * 8 + x] != 0)
*(screen + x) = source[c * 64 + y * 8 + x];
}
screen += SCREEN_WIDTH;
}
I know that is example is VERY unoptimised, but it is just an example after
all.
This example's inner loop executes 8 times a very simple statement. To
reduce the time taken performing branches and tests to see if the loop has
ended yet, we can simply execute the inner statement 8 times with
appropriate values already inserted.
for (y=0; y<8; y++)
{
if (source[c * 64 + y * 8 + 0] != 0)
*(screen + 0) = source[c * 64 + y * 8 + 0];
if (source[c * 64 + y * 8 + 1] != 0)
*(screen + 1) = source[c * 64 + y * 8 + 1];
if (source[c * 64 + y * 8 + 2] != 0)
*(screen + 2) = source[c * 64 + y * 8 + 2];
if (source[c * 64 + y * 8 + 3] != 0)
*(screen + 3) = source[c * 64 + y * 8 + 3];
if (source[c * 64 + y * 8 + 4] != 0)
*(screen + 4) = source[c * 64 + y * 8 + 4];
if (source[c * 64 + y * 8 + 5] != 0)
*(screen + 5) = source[c * 64 + y * 8 + 5];
if (source[c * 64 + y * 8 + 6] != 0)
*(screen + 6) = source[c * 64 + y * 8 + 6];
if (source[c * 64 + y * 8 + 7] != 0)
*(screen + 7) = source[c * 64 + y * 8 + 7];
screen += SCREEN_WIDTH;
}
The references to the loop variable x have now been replaced by the
constant values.
There are a number of ways that this could be further optimised, like
precalculation, etc. I'll leave this as an exercise for the reader. 8^)
--
TTFN
Sly (Steve)
- Raw text -