Message-ID: <347E0079.3E92@seanet.com>
Date: Thu, 27 Nov 1997 15:21:29 -0800
From: "Alan M. Doerhoefer"
MIME-Version: 1.0
To: "William A. Barath", eliz AT is DOT elta DOT co DOT il
CC: djgpp AT delorie DOT com
Subject: Re: binutils 2.8.1
References:
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Precedence: bulk

Eli Zaretskii wrote:
> There are several -falign-XXX switches. I suggest trying them.

OK, I am guessing you are talking about the -malign-loops=NUM, etc.,
switches? (I couldn't find any "-falign..." switches in INFO.)

According to the gcc info-docs, -m386 and -m486 automatically set the
default alignment for -malign-functions= and -malign-jumps= to 2 and 4
respectively. The only switch not affected is -malign-loops=, so I tried
as many combinations of -m386 or -m486 with -malign-loops= and
-malign-double as I could think of.

With binutils 2.8.1, using -m386 instead of -m486 made the program very
slightly slower: less than a tenth of a second. With binutils 2.7, using
-m386 actually made it FASTER by about a tenth of a second. Changing
-malign-loops= to 4 caused no change with either binutils 2.8.1 or 2.7.
Adding -malign-double SLOWED the program by nearly 1.5 seconds with BOTH
versions of binutils.

To allow anybody to duplicate some of my tests, I am posting a small
program that simulates the Mandelbrot math from my large program. This
code just follows the floating point loops and tests without actually
plotting any pixels. It is not optimized in any way; my actual code uses
a circle avoidance and guessing algorithm. After several seconds of
computing, it prints the time elapsed in seconds and the number of clock
ticks, both based on the very low-resolution "clock()" function.

----------start sample program---------------------
#include <stdio.h>
#include <time.h>

typedef struct ComplexNum {
    double real;    /* the complex number struct */
    double imag;
} ComplexNum;

/* The Mandelbrot math; tests z = (z^2) + c up to maxiter times
   and returns color */
static inline int test_point( ComplexNum c, int maxiter )
{
    int i, color = 0;
    /*double ycenter, ylcenter, xbcenter, xscenter;*/
    double length;
    ComplexNum z, x;

    z.real = 0.0;
    z.imag = 0.0;

    /* uncomment to avoid testing in certain large circles:*/
    /*
    ylcenter = c.imag;
    ycenter = ylcenter * ylcenter;
    xbcenter = c.real + 0.25;
    xscenter = c.real + 1.0;
    if ((xbcenter * xbcenter + ycenter) <= 0.2304 ||
        (xscenter * xscenter + ycenter) <= 0.0529)
        return (0);
    */

    for (i = 1; i <= maxiter; ++i) {
        x.real = (z.real * z.real) - (z.imag * z.imag);
        x.imag = (z.real * z.imag) + (z.imag * z.real);
        z.real = x.real + c.real;
        z.imag = x.imag + c.imag;
        length = z.real * z.real + z.imag * z.imag;
        if (length > 4.0) {
            if (i > 254)
                color = (i % 253) + 1;
            else
                color = i;
            break;
        }
    }
    return (color);
}

int main(void)
{
    int x, y, color;
    ComplexNum c;
    double start_y = 1.3, start_x = -2.3;
    double scale = 0.0053125;
    double t0, t1;

    t0 = (double)clock() / CLOCKS_PER_SEC;
    for (y = 10; y < 480; y++) {
        c.imag = start_y - (scale * y);
        for (x = 0; x < 640; x++) {
            c.real = start_x + (scale * x);
            color = test_point(c, 150);
        }
    }
    t1 = ((double)clock() / CLOCKS_PER_SEC) - t0;
    printf("%10.5f seconds\n", t1);
    printf("%f clocks\n\n", t1 * CLOCKS_PER_SEC);
    return (0);
}
----------------------end----------------------------

Compiling with -m486 -ffast-math, I get:

binutils    time (s)    clocks
2.8.1       5.93407     540
2.7         5.60440     510

Compiling with -m386 -ffast-math, I get:

binutils    time (s)    clocks
2.8.1       5.98901     545
2.7         5.4945      505

I also tested this in plain DOS (instead of Win95), and the numbers all
showed similar ratios but were about a tenth of a second faster. All of
my tests were done on a 100 MHz Pentium with 48 MB RAM and an Acer BIOS.

A couple of thoughts:

My timer is not very good; to really test this, we should use the
Pentium's built-in instruction counter, but I need to learn how first.

This is based on floating point operations tested on a Pentium. Who
knows what would happen on a 386 or 486?

I hope I haven't discouraged people from using binutils 2.8.1; as you
can see, I am nitpicking over minuscule differences in speed. Nobody
should really notice a speed difference of a second or less. However, I
have been trying to optimize my fractal program for some time, and I was
alarmed when all my work seemed to disappear.

Finally, someone asked if perhaps I had installed a new version of gcc
without realizing it. Answer: Nope, it's definitely the same one. I only
downloaded two files: djdev201.zip and bnu281.zip. The rest I keep on
floppies, including gcc.

Happy Thanksgiving,
Alan Doerhoefer
aland AT seanet DOT com