delorie.com/archives/browse.cgi   search  
Mail Archives: djgpp/1997/03/04/09:52:07

From: "Sly" <sly AT aussie DOT net>
Newsgroups: comp.os.msdos.djgpp
Subject: Re: Loop unrolling: What is it?
Date: 4 Mar 1997 03:45:50 GMT
Organization: Sly
Lines: 65
Message-ID: <01bc284f$0fd25020$8c081ecb@sly>
References: <Pine DOT SUN DOT 3 DOT 91 DOT 970303130541 DOT 9009A-100000 AT is> <5ffjqj$g5a AT nr1 DOT toronto DOT istar DOT net>
NNTP-Posting-Host: max0ppp10.bne.aussie.net
To: djgpp AT delorie DOT com
DJ-Gateway: from newsgroup comp.os.msdos.djgpp

Jeff Weeks <pweeks AT execulink DOT com> wrote in article
<5ffjqj$g5a AT nr1 DOT toronto DOT istar DOT net>...
> All this talk about unrolling loop and I feel like an idiot :)  What
> exactly is invovled in the process of unrolling loops?
> 

Loop unrolling is the simple optimisation process of taking a constant
iteration loop and reducing the number of iterations performed.  For
example, you may have a routine that draws bitmapped 8x8 characters onto
the screen (assuming characters are stored as 8 rows of 8 pixels)...

unsigned char *source;	/* Pointer to character bitmap data */
unsigned char *screen;	/* Pointer to screen position where character will
be drawn */
int x, y;			/* Loop variables */
unsigned char c;	/* ASCII value to draw */

for (y=0; y<8; y++)
{
	for (x=0; x<8; x++)
	{
		if (source[c * 64 + y * 8 + x] != 0)
			*(screen + x) = source[c * 64 + y * 8 + x];
	}
	screen += SCREEN_WIDTH;
}

I know that is example is VERY unoptimised, but it is just an example after
all.

This example's inner loop executes 8 times a very simple statement.  To
reduce the time taken performing branches and tests to see if the loop has
ended yet, we can simply execute the inner statement 8 times with
appropriate values already inserted.

for (y=0; y<8; y++)
{
	if (source[c * 64 + y * 8 + 0] != 0)
		*(screen + 0) = source[c * 64 + y * 8 + 0];
	if (source[c * 64 + y * 8 + 1] != 0)
		*(screen + 1) = source[c * 64 + y * 8 + 1];
	if (source[c * 64 + y * 8 + 2] != 0)
		*(screen + 2) = source[c * 64 + y * 8 + 2];
	if (source[c * 64 + y * 8 + 3] != 0)
		*(screen + 3) = source[c * 64 + y * 8 + 3];
	if (source[c * 64 + y * 8 + 4] != 0)
		*(screen + 4) = source[c * 64 + y * 8 + 4];
	if (source[c * 64 + y * 8 + 5] != 0)
		*(screen + 5) = source[c * 64 + y * 8 + 5];
	if (source[c * 64 + y * 8 + 6] != 0)
		*(screen + 6) = source[c * 64 + y * 8 + 6];
	if (source[c * 64 + y * 8 + 7] != 0)
		*(screen + 7) = source[c * 64 + y * 8 + 7];
	screen += SCREEN_WIDTH;
}

The references to the loop variable x have now been replaced by the
constant values.

There are a number of ways that this could be further optimised, like
precalculation, etc.  I'll leave this as an exercise for the reader.  8^)

-- 
TTFN
Sly (Steve)

- Raw text -


  webmaster     delorie software   privacy  
  Copyright © 2019   by DJ Delorie     Updated Jul 2019