Mail Archives: pgcc/1999/03/16/17:59:54
On Tue, Mar 16, 1999 at 08:33:48PM +0100, Axel Thimm wrote:
> We are currently trying to see what we can drain maximally from PII for a
> certain flop intensive application (QCD). Until now folks were using gcc 2.8.1
> with -O2 -fomit-frame-pointer. I thought I might surprise them with egcs or
> pgcc, but the performance dropped from 80 to 50 Mflop/s (?)
This can be related to a variety of factors, some of which are outside the
scope of the compiler (the topic warrants a whole book of its own). Here are
the two most prominent problems:
- Double alignment. Depending on how your program allocates memory for
doubles, alignment can change from optimal to non-optimal by pure luck.
- Cache colouring (or lack thereof). Sometimes moving data structures around
will change performance randomly (from run to run). Some algorithms are
highly sensitive to this. Unfortunately, the compiler cannot help here.
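One way to take the luck out of double alignment for heap data is to
over-allocate and round the pointer up by hand. The following is only a
minimal sketch: alloc_doubles_aligned, free_doubles_aligned and is_aligned
are hypothetical helper names, not part of any library, and the 32-byte
boundary used in the usage note is an assumption about the cache line size.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocate n doubles whose start address is a multiple
 * of `align` bytes. `align` must be a power of two and at least
 * sizeof(void *). The pointer returned by malloc is stored just in front of
 * the aligned block so that it can be handed back to free() later. */
static double *alloc_doubles_aligned(size_t n, size_t align)
{
    void *raw;
    uintptr_t addr;
    double *p;

    assert(align >= sizeof(void *) && (align & (align - 1)) == 0);

    raw = malloc(n * sizeof(double) + align + sizeof(void *));
    if (raw == NULL)
        return NULL;

    /* Leave room for the saved pointer, then round up to the boundary. */
    addr = (uintptr_t)raw + sizeof(void *);
    addr = (addr + align - 1) & ~(uintptr_t)(align - 1);

    p = (double *)addr;
    ((void **)p)[-1] = raw;
    return p;
}

/* Release a block obtained from alloc_doubles_aligned(). */
static void free_doubles_aligned(double *p)
{
    if (p != NULL)
        free(((void **)p)[-1]);
}

/* Nonzero if p is a multiple of `align` (a power of two). */
static int is_aligned(const void *p, size_t align)
{
    return ((uintptr_t)p & (align - 1)) == 0;
}
```

For example, alloc_doubles_aligned(1000, 32) would give an array that starts
on a 32-byte boundary, so no double in it straddles two such blocks.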
Also, which OS are you using, and which libc (if on Linux)? Most x86
operating systems don't align the stack to an 8-byte boundary, so it is
again a matter of luck whether the code runs fast or slow.
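A quick way to see what the stack is doing is to print the address of a
local double. This is a rough sketch with hypothetical helper names
(misalignment, probe_stack_double). The compiler is free to realign locals
itself, so the result only describes one particular build, and on a current
x86-64 system it will normally report 0.

```c
#include <stdint.h>
#include <stdio.h>

/* Offset (0..7) of an address from the previous 8-byte boundary. */
static unsigned misalignment(const void *p)
{
    return (unsigned)((uintptr_t)p & 7u);
}

/* Print and return the offset of one local double. A nonzero value means
 * this particular double is not 8-byte aligned in this build and run. */
static unsigned probe_stack_double(void)
{
    volatile double probe = 0.0;
    unsigned off = misalignment((const void *)&probe);

    printf("local double at %p, offset %u from an 8-byte boundary\n",
           (void *)&probe, off);
    return off;
}
```

Calling probe_stack_double() from a few different call depths gives a feel
for whether the alignment is stable or depends on the call chain.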
Also, others have pointed out higher optimization levels that help in an
unrelated way.
You might also want to try -malign-double (and hope your libraries work
with that switch). It aligns all doubles in structures correctly (that
rarely improves performance, but when it does, it's by some 30% or more).
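To see what the switch changes, look at where a double lands inside a
structure. The sketch below uses a made-up struct sample. On 32-bit x86
the traditional layout puts the double at offset 4, and with -malign-double
it should move to offset 8; on x86-64 it is at offset 8 either way. The
changed layout is also why libraries built without the switch may stop
working.

```c
#include <stddef.h>
#include <stdio.h>

/* Made-up example structure: an int followed by a double. */
struct sample {
    int    tag;
    double value;
};

/* Byte offset of the double inside the structure. */
static size_t value_offset(void)
{
    return offsetof(struct sample, value);
}

/* Print the layout the compiler chose for this build. */
static void print_layout(void)
{
    printf("offsetof(value) = %lu, sizeof(struct sample) = %lu\n",
           (unsigned long)value_offset(),
           (unsigned long)sizeof(struct sample));
}
```

Building the same file once with and once without -malign-double on a
32-bit x86 target and calling print_layout() shows the difference directly.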
>
> [This was pgcc 1.1, as I cannot compile any newer snapshot/CVS, see related
> mail in this list]
I don't think it is related to that version (regardless of what I said
below).
>
> Now I know of gcc to egcs regression, but I thought that pgcc was atop of both
There is no real regression regarding technology, though. Unlike gcc, the
releases have disabled more optimizations than necessary, to be as stable as
possible (more stable than, say, gcc-2.8). The current snapshots of both
egcs and pgcc are faster on average than gcc.
> Is this a known fact? Have others made similar experiences? The program is
x86 fp performance is veeery sensitive to environment issues.
> memory intensive (small ratio of computations per memory accesses) and perhaps
> this is what makes the difference.
It might. Cache line aliasing can make up to a 200% difference in runtime.
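As an illustration of aliasing: two addresses compete for the same cache
set when they differ by a multiple of (line size * number of sets). The
sketch below assumes a 16 KB, 4-way L1 data cache with 32-byte lines, which
is my understanding of the PII; adjust the constants for other CPUs. The
names cache_set and padded_row_len are hypothetical helpers, and padding
rows is just one common workaround, not something the compiler does for you.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed cache geometry: 16 KB / 4 ways / 32-byte lines = 128 sets. */
enum { LINE_BYTES = 32, NUM_SETS = 128 };

/* Index of the cache set that an address maps to. */
static unsigned cache_set(uintptr_t addr)
{
    return (unsigned)((addr / LINE_BYTES) % NUM_SETS);
}

/* Row length (in doubles) to use for a 2-D array with n columns. If the
 * natural row size is a multiple of LINE_BYTES * NUM_SETS, every row would
 * start in the same set, so one cache line of padding is added. */
static size_t padded_row_len(size_t n)
{
    size_t way_bytes = (size_t)LINE_BYTES * NUM_SETS;
    size_t row_bytes = n * sizeof(double);

    if (row_bytes % way_bytes == 0)
        n += LINE_BYTES / sizeof(double);
    return n;
}
```

With these numbers a row of 512 doubles (4096 bytes) would be padded to 516,
so that walking down a column no longer hits the same set on every row.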
--
-----==- |
----==-- _ |
---==---(_)__ __ ____ __ Marc Lehmann +--
--==---/ / _ \/ // /\ \/ / pcg AT goof DOT com |e|
-=====/_/_//_/\_,_/ /_/\_\ XX11-RIPE --+
The choice of a GNU generation |
|