From: "Sly" Newsgroups: comp.os.msdos.djgpp Subject: Re: Loop unrolling: What is it? Date: 4 Mar 1997 03:45:50 GMT Organization: Sly Lines: 65 Message-ID: <01bc284f$0fd25020$8c081ecb@sly> References: <5ffjqj$g5a AT nr1 DOT toronto DOT istar DOT net> NNTP-Posting-Host: max0ppp10.bne.aussie.net To: djgpp AT delorie DOT com DJ-Gateway: from newsgroup comp.os.msdos.djgpp Jeff Weeks wrote in article <5ffjqj$g5a AT nr1 DOT toronto DOT istar DOT net>... > All this talk about unrolling loop and I feel like an idiot :) What > exactly is invovled in the process of unrolling loops? > Loop unrolling is the simple optimisation process of taking a constant iteration loop and reducing the number of iterations performed. For example, you may have a routine that draws bitmapped 8x8 characters onto the screen (assuming characters are stored as 8 rows of 8 pixels)... unsigned char *source; /* Pointer to character bitmap data */ unsigned char *screen; /* Pointer to screen position where character will be drawn */ int x, y; /* Loop variables */ unsigned char c; /* ASCII value to draw */ for (y=0; y<8; y++) { for (x=0; x<8; x++) { if (source[c * 64 + y * 8 + x] != 0) *(screen + x) = source[c * 64 + y * 8 + x]; } screen += SCREEN_WIDTH; } I know that is example is VERY unoptimised, but it is just an example after all. This example's inner loop executes 8 times a very simple statement. To reduce the time taken performing branches and tests to see if the loop has ended yet, we can simply execute the inner statement 8 times with appropriate values already inserted. for (y=0; y<8; y++) { if (source[c * 64 + y * 8 + 0] != 0) *(screen + 0) = source[c * 64 + y * 8 + 0]; if (source[c * 64 + y * 8 + 1] != 0) *(screen + 1) = source[c * 64 + y * 8 + 1]; if (source[c * 64 + y * 8 + 2] != 0) *(screen + 2) = source[c * 64 + y * 8 + 2]; if (source[c * 64 + y * 8 + 3] != 0) *(screen + 3) = source[c * 64 + y * 8 + 3]; if (source[c * 64 + y * 8 + 4] != 0) *(screen + 4) = source[c * 64 + y * 8 + 4]; if (source[c * 64 + y * 8 + 5] != 0) *(screen + 5) = source[c * 64 + y * 8 + 5]; if (source[c * 64 + y * 8 + 6] != 0) *(screen + 6) = source[c * 64 + y * 8 + 6]; if (source[c * 64 + y * 8 + 7] != 0) *(screen + 7) = source[c * 64 + y * 8 + 7]; screen += SCREEN_WIDTH; } The references to the loop variable x have now been replaced by the constant values. There are a number of ways that this could be further optimised, like precalculation, etc. I'll leave this as an exercise for the reader. 8^) -- TTFN Sly (Steve)