Mail Archives: djgpp/2000/10/19/19:15:26
gdemont AT my-deja DOT com wrote:
> Some of the programs I have under hand do run slower.
*How much* slower? It's really hard to discuss differences in
optimization without knowing how large they are. Is it one percent
slower? Fifty? Does it take 5 times as long?
And what is that code doing in the first place?
> Target is a Pentium-S 166Mhz.
> The options are
> * For gcc 2.7.2 (gnat 3.10):
> -i -gnatpn -O2 -fomit-frame-pointer -funroll-loops
> -m486 -malign-loops=2 -malign-jumps=2 -malign-functions=2
'-funroll-loops' hardly does you any good on any x86-type machine. It
increases the code size, and as soon as the amount of code being
looped over (the 'active set') grows beyond the size of the first-level
cache, you'll see a noticeable performance degradation. The same
happens as you cross other size barriers. Pentium-class machines are
good enough at branching (and branch prediction, in particular) that
unrolling loops doesn't gain you terribly much anyway, before the loss
of cache locality strikes back.
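To make the trade-off concrete, here is a minimal sketch in C of what
unrolling does to a loop. It is not your code (which is Ada), and the
function names sum_rolled and sum_unrolled4 are made up for
illustration. Both functions compute the same sum; the second one
trades fewer branches per element for a loop body roughly four times
as large, plus a cleanup loop.

```c
#include <stddef.h>

/* Plain loop: one add and one conditional branch per element. */
long sum_rolled(const int *a, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += a[i];
    return total;
}

/* The same loop unrolled four times by hand (hypothetical example).
   One branch per four elements, but about four times the code in the
   loop body, and an extra loop for the leftovers. */
long sum_unrolled4(const int *a, size_t n)
{
    long total = 0;
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        total += a[i];
        total += a[i + 1];
        total += a[i + 2];
        total += a[i + 3];
    }
    for (; i < n; i++)  /* leftovers when n is not a multiple of 4 */
        total += a[i];
    return total;
}
```

With a body this small the extra code is harmless; the concern above
applies when the compiler does this to every loop in a large program.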
To explore this further, it would be necessary to look at the compiler
input (source) and output (assembly), and of course to do some
profiling to see where the code spends most of its time, so the
scrutiny can be limited in scope.
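As a crude first step before real profiling, one can time a suspect
routine directly and compare the numbers between the two compiler
versions. The following is only a sketch in C under that assumption;
time_calls and busy_work are hypothetical names, and busy_work merely
stands in for whatever routine is under suspicion.

```c
#include <time.h>

/* Hypothetical helper: call fn() reps times and return the CPU time
   consumed, in seconds, as measured by the standard clock(). */
double time_calls(void (*fn)(void), long reps)
{
    clock_t start = clock();
    for (long i = 0; i < reps; i++)
        fn();
    clock_t end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC;
}

/* Stand-in for the routine being measured. The volatile sink keeps
   the optimizer from discarding the work entirely. */
static volatile long sink;

void busy_work(void)
{
    long s = 0;
    for (long i = 0; i < 1000; i++)
        s += i;
    sink = s;
}
```

Calling time_calls(busy_work, 100000) in builds made with each
compiler gives a rough figure to compare. The resolution of clock() is
coarse, so the repetition count has to be large enough for the total
to be well above one clock tick.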
The versions of the DJGPP runtime and binutils can also play an
important role. Fully correct alignment of the code on 32-byte
boundaries helps, but it took us several iterations to get it right.
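For reference, the arithmetic behind "aligned on a 32-byte boundary"
can be sketched in C as below. This only illustrates the address
calculation, not how the DJGPP toolchain actually places code; the
names ALIGNMENT, is_aligned32 and align_up32 are made up for the
example.

```c
#include <stdint.h>

#define ALIGNMENT 32u  /* boundary size in bytes; must be a power of two */

/* Nonzero if addr lies exactly on a 32-byte boundary. */
int is_aligned32(uintptr_t addr)
{
    return (addr & (ALIGNMENT - 1)) == 0;
}

/* Round addr up to the next 32-byte boundary (unchanged if it is
   already aligned). */
uintptr_t align_up32(uintptr_t addr)
{
    return (addr + (ALIGNMENT - 1)) & ~(uintptr_t)(ALIGNMENT - 1);
}
```

An address of 68, for example, is not aligned and rounds up to 96,
which means 28 bytes of padding in front of the aligned item.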
--
Hans-Bernhard Broeker (broeker AT physik DOT rwth-aachen DOT de)
Even if all the snow were burnt, ashes would remain.