Mail Archives: djgpp/1997/03/02/21:16:18
> This is a very simple function (but also very important in numerical
> applications). Understanding how to coerce GCC into producing near
> optimal code (without obfuscating the source) for the matrix
> multiplication problem would be very beneficial to my work, since the
> required "tricks" should be widely applicable to my code. I would
> like to hear any further thoughts or ideas on this subject.
What I have found is that the FPU code generated by gcc is nothing
short of weird... :) I am coding for the Pentium, and as a result have
started converting all of my texture mapping routines to asm and doing the
optimizations myself, because DJGPP does a lot of repetitive things. Look
at the code generated every time you want to store an integer.
Question: Is there any way to let DJGPP know I am running in single
precision mode, and stop it doing all the crap it does every time I want
to store an integer? It would save a lot of hassle. I don't have the code
on me, so this is from memory, but to do a fistp myself running in single
precision, I normally have something like:
flds _a;
fmuls _b;
fadds _c;
fistpl _d;
Whereas DJGPP goes:
flds _a;
fmuls _b;
fadds _c;
fstcw -4(%ebp);
fldcw -8(%ebp);
fistpl _d;
fldcw -4(%ebp);
etc etc, only with a _lot_ more crap in the middle. Basically, I want
to inform DJGPP _not_ to load and store the FPU control word information
as there is no need! Especially when I am doing a lot of fistp's, as this
just kills performance...
Leathal.