Mail Archives: djgpp/1996/12/16/07:51:39
Murray Stokely wrote:
>
> This is a little code snippet from a VESA lens effect I coded.
> The diameter and magnification of the lens are adjustable in realtime
> via the +/- keys so I need the lens calculating function as fast as
> possible. Someone gave me a good tip with my sqrt problem earlier in
> using the difference of squares to take out one of the multiplies but
> I'd like to take it a step further. This routine is actually faster
> than I expected it would be on my 486dx4, but still every clock cycle
> counts ;) I know there's lots of room for improvement, so any tips
> would be appreciated.
>
Here are a few ideas (which should make your code a bit faster)...
#include <math.h>    /* sqrt() */
#include <stdlib.h>  /* abs() */

/* tfm[] is the lookup table from your program; I assume it is a global
   array with room for at least diameter*diameter entries */
void calculate_tfm(int diameter, char magnification)
{
    int a,b,x,y,z,s;
    int y2,x2y2;
    int radius,rad2;

    radius=diameter/2;
    /* this is used a lot, let's precalculate */
    rad2=radius*radius;
    /* save a sqrt(): compare squared distances against s instead */
    s=abs(rad2 - (magnification * magnification));

    /* use square symmetry: 4 times fewer calculations */
    /* I'm not sure whether these recursive formulae are useful, but I love
       them: y2 tracks y*y and x2y2 tracks x*x+y*y, each one advanced with
       (n+1)^2 = n^2 + 2n + 1 instead of a multiply */
    for(y=0,y2=0;y<radius;y2+=2*y+1,y++)
    {
        for(x=0,x2y2=y2;x<radius;x2y2+=2*x+1,x++)
        {
            if (x2y2 >= s)
            {
                /* this can be improved: for x or y = 0 some values are
                   calculated too many times. maybe also the condition
                   x2y2>=s can be taken out of the loop.
                   note: row 0 and column 0 of tfm are never written here,
                   because -x+radius and -y+radius only go down to 1 */
                tfm[(y+radius)*diameter+(x+radius)]=(y+radius)*diameter+(x+radius);
                tfm[(-y+radius)*diameter+(x+radius)]=(-y+radius)*diameter+(x+radius);
                tfm[(y+radius)*diameter+(-x+radius)]=(y+radius)*diameter+(-x+radius);
                tfm[(-y+radius)*diameter+(-x+radius)]=(-y+radius)*diameter+(-x+radius);
            } else {
                /* round to nearest using integer arithmetic only; z is at
                   least 1 here as long as magnification <= radius */
                z=(int)(sqrt((double)(rad2-x2y2))+0.5);
                a=(x*magnification+z/2)/z;
                b=(y*magnification+z/2)/z;
                tfm[(y+radius)*diameter+(x+radius)]=(b+radius)*diameter+(a+radius);
                tfm[(-y+radius)*diameter+(x+radius)]=(-b+radius)*diameter+(a+radius);
                tfm[(y+radius)*diameter+(-x+radius)]=(b+radius)*diameter+(-a+radius);
                tfm[(-y+radius)*diameter+(-x+radius)]=(-b+radius)*diameter+(-a+radius);
            }
        } /* end of for x */
    } /* end of for y */
}
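
To make the recursive formula behind y2 and x2y2 easier to check in isolation, here is a minimal sketch. The helper count_inside() and its limit parameter are hypothetical (they are not part of the lens code); it only shows how x*x + y*y can be carried along with additions, and how comparing it against a squared limit avoids a sqrt().

```c
#include <assert.h>

/* Hypothetical helper, for illustration only: counts the grid points (x, y)
   with 0 <= x, y < radius whose squared distance x*x + y*y is below limit.
   No multiply is used to form the squares; each running value is advanced
   with (n+1)^2 = n^2 + 2n + 1. */
int count_inside(int radius, int limit)
{
    int x, y, y2, x2y2, count = 0;

    for (y = 0, y2 = 0; y < radius; y2 += 2 * y + 1, y++) {
        /* y2 == y*y at this point */
        for (x = 0, x2y2 = y2; x < radius; x2y2 += 2 * x + 1, x++) {
            /* x2y2 == x*x + y*y at this point; verify the recurrence */
            assert(x2y2 == x * x + y * y);
            /* compare squared distances, so no sqrt() is needed */
            if (x2y2 < limit)
                count++;
        }
    }
    return count;
}
```

The important detail is that the running value is updated in the loop increment, after the body has used it. If the update comes first, the body sees (x+1)^2 rather than x^2.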
> ( I'll eventually convert all doubles/floats to 16.16 fixed point, so
> skip that obvious MAJOR optimization )
Do you really need 16.16? That leaves very little room for the integer
part, especially if you have to compute squares, multiplies and things
like that. Isn't 22.10 enough (that's about three decimal places), or
even 25.7 (about two decimal places)?
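
As a rough illustration of the trade-off, here is a small sketch assuming signed 32-bit fixed-point values. The helper names (fixed_max_int, square_fits, fixed_mul) and the radius of 200 used in the note below are hypothetical, chosen only for this example.

```c
#include <stdint.h>

/* Largest whole number representable in a signed 32-bit fixed-point value
   with frac_bits fractional bits (16 for 16.16, 10 for 22.10, 7 for 25.7). */
int32_t fixed_max_int(int frac_bits)
{
    return (int32_t)(INT32_MAX >> frac_bits);
}

/* Non-zero if n*n still fits in the integer part of that format. */
int square_fits(int n, int frac_bits)
{
    return (int64_t)n * n <= (int64_t)fixed_max_int(frac_bits);
}

/* Fixed-point multiply: the raw product carries 2*frac_bits fractional
   bits, so it is formed in 64 bits and then scaled back down. */
int32_t fixed_mul(int32_t a, int32_t b, int frac_bits)
{
    return (int32_t)(((int64_t)a * b) / ((int64_t)1 << frac_bits));
}
```

With these definitions, fixed_max_int(16) is 32767, so the square of a radius of 200 (40000) does not fit in 16.16, while it fits easily in 22.10, whose limit is 2097151. The price is resolution: one step is 1/1024 in 22.10 and 1/128 in 25.7, against 1/65536 in 16.16.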
Francois