From: Andrew Bainbridge
Newsgroups: comp.os.msdos.djgpp
Subject: DJGPP division optimisations
Date: Mon, 13 Jul 1998 23:26:28 +0100
Organization: Virgin News Service
Lines: 90
Message-ID: <35AA8994.3FC1@virgin.net>
NNTP-Posting-Host: 194.168.71.22
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
To: djgpp AT delorie DOT com
DJ-Gateway: from newsgroup comp.os.msdos.djgpp
Precedence: bulk

I wrote a fractal generator a while ago using DJGPP and Allegro and was
quite pleased with the speed it ran at. But there is always room for
improvement, so I started to look at it again this evening.

I had suspected that DJGPP wasn't optimising the way it should, so I did
a little test. Basically the fractal routine relied heavily on two divide
operations on long integers. In the source code I used

x /= 2;
y /= 2;

for clarity, on the assumption that the compiler would replace these
operations with shifts. However, it seems this isn't a valid assumption.
Replacing the code with

x >>= 1;
y >>= 1;

gave a large speed-up.

To make things simpler I have made a shorter program that demonstrates
the same problem. On my machine the test program takes 1.78 seconds to
run using shifts and 5.08 seconds using divides. Can somebody tell me why
this is happening?

BTW I have tried compiling with all kinds of switches but mainly I use:

gcc test.c -o test.exe -O3 -m486 -ffast-math -fomit-frame-pointer -lalleg

Here is the code:

#include <stdio.h>
#include <allegro.h>

#define WIDTH 640
#define HEIGHT 480
#define MAX_POINTS 50000000

volatile int timer = 0;

void inc_timer()
{
    timer++;
}
END_OF_FUNCTION(inc_timer);

int main()
{
    long r, r2, r3, r4, r5;   // Will store a random number below
    long x1=WIDTH/2-1, x3=WIDTH-1, y2=HEIGHT-1, y3=HEIGHT-1;
    long x = x1, y = 0;       // The current cursor position
    long i;                   // Just a for loop counter
    float seconds;

    allegro_init();
    install_keyboard();
    install_timer();          // Install one of Allegro's timer routines

    LOCK_VARIABLE(timer);
    LOCK_FUNCTION(inc_timer);
    install_int(inc_timer, 10);   // INCREMENT TIMER EVERY 1/100 SECOND

    printf("Benchmarking\n");
    readkey();

    timer = 0;
    for(i = MAX_POINTS; i; i--)
    {
        x += x3;
        y += y3;
//      x /= 2;               // Swap these two lines for the two below
//      y /= 2;
        x >>= 1;
        y >>= 1;

        x += x1;
        y += y2;
//      x /= 2;               // And these two
//      y /= 2;
        x >>= 1;
        y >>= 1;
    }
    seconds = timer;

    // Need to print x and y to stop the compiler from missing out
    // the loop altogether.
    printf("%ld, %ld\n", x, y);

    seconds /= 100;
    printf("Took %2.2f seconds\n", seconds);

    return 0;
}