Mail Archives: djgpp/1998/07/14/01:30:47
From: Andrew Bainbridge <andrew DOT bainbridge AT virgin DOT net>
Newsgroups: comp.os.msdos.djgpp
Subject: DJGPP division optimisations
Date: Mon, 13 Jul 1998 23:26:28 +0100
Organization: Virgin News Service
Lines: 90
Message-ID: <35AA8994.3FC1@virgin.net>
NNTP-Posting-Host: 194.168.71.22
Mime-Version: 1.0
To: djgpp AT delorie DOT com
DJ-Gateway: from newsgroup comp.os.msdos.djgpp

I wrote a fractal generator a while ago using DJGPP and Allegro and was
quite pleased with the speed it ran at. But there is always room for
improvement, so I started to look at it again this evening. I had
suspected that DJGPP wasn't optimising the way it should, so I did a
little test.

Basically the fractal routine relies heavily on two divide operations
on long integers. In the source code I used x /= 2; y /= 2; for clarity,
on the assumption that the compiler would replace these operations with
shifts. However, it seems this isn't a valid assumption. Replacing the
code with x >>= 1; y >>= 1; gave a large speed-up.

To make things simpler I have made a shorter program that demonstrates
the same problem. On my machine the test program takes 1.78 seconds to
run using shifts and 5.08 seconds using divides. Can somebody tell me
why this is happening? BTW I have tried compiling with all kinds of
switches, but mainly I use:

gcc test.c -o test.exe -O3 -m486 -ffast-math -fomit-frame-pointer -lalleg

Here is the code:

#include <stdio.h>
#include <allegro.h>

#define WIDTH 640
#define HEIGHT 480
#define MAX_POINTS 50000000

volatile int timer = 0;

void inc_timer() {
    timer++;
}
END_OF_FUNCTION(inc_timer);

int main() {
    long r, r2, r3, r4, r5;     // Will store a random number below
    long x1=WIDTH/2-1, x3=WIDTH-1, y2=HEIGHT-1, y3=HEIGHT-1;
    long x = x1, y = 0;         // The current cursor position
    long i;                     // Just a for loop counter
    float seconds;

    allegro_init();
    install_keyboard();
    install_timer();

    // Install one of Allegro's timer routines
    LOCK_VARIABLE(timer);
    LOCK_FUNCTION(inc_timer);
    install_int(inc_timer, 10); // INCREMENT TIMER EVERY 1/100 SECOND

    printf("Benchmarking\n");
    readkey();
    timer = 0;

    for(i = MAX_POINTS; i; i--) {
        x += x3;
        y += y3;
        // x /= 2;              // Swap these two lines for the two below
        // y /= 2;
        x >>= 1;
        y >>= 1;
        x += x1;
        y += y2;
        // x /= 2;              // And these two
        // y /= 2;
        x >>= 1;
        y >>= 1;
    }

    seconds = timer;
    printf("%ld, %ld\n", x, y); // Need to print x and y to stop the
                                // compiler from missing out the loop
                                // altogether. (%ld because they are long.)
    seconds /= 100;
    printf("Took %2.2f seconds\n", seconds);
    return 0;
}