Mail Archives: djgpp/2000/01/19/07:40:27
On 18 Jan 2000, Dieter Buerssner wrote:
> My CPU is AMD K6-2 266.
I don't know anything about K6. AFAIK, GCC's code is optimized
towards Intel's recommendations; I don't know how well these fit K6.
> gcc 2.9.2: flags -fomit-frame-pointer -ffast-math + indicated flags
You mean 2.95.2, right?
> -On -mcpu=k6 -On -march=k6
> -O 86383 92070 92070
> -O2 85852 86966 87009
> -O3 81476 89791 89814
> -O6 81421 89833 89818
>
> In all three cases -O produces the fastest code.
The differences are small enough to be explained by alignment. I
suggest looking at the code (disassemble it inside a debugger) and
counting how many targets of jmp and call instructions are misaligned.
Intel recommends aligning them on 16-byte boundaries, unless that
would take more than 7 bytes of padding. GCC 2.95.2 emits the
correct alignment directives (.balign 16,,7), but your Binutils mess
that up, because each .o file is aligned on a 4-byte boundary instead
of a 16-byte one. In effect, you are disrupting the CPU's prefetch
queues, which can have a significant effect on performance.
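To make the alignment rule concrete, here is a minimal sketch in C of
the arithmetic behind a directive like .balign 16,,7, and of what
happens when the section holding an aligned label is itself placed on
a boundary that is only 4-byte aligned. The function names are made up
for illustration; none of this is code from GCC or Binutils.

```c
#include <assert.h>

/* Padding (in bytes) that a directive like ".balign align,,max_skip"
   would insert at offset addr.  If reaching the boundary needs more
   than max_skip bytes, the alignment is skipped entirely. */
unsigned balign_padding(unsigned long addr, unsigned align, unsigned max_skip)
{
    unsigned pad;

    assert(align != 0 && (align & (align - 1)) == 0);  /* power of two */
    pad = (unsigned)((align - (addr & (align - 1))) & (align - 1));
    return pad > max_skip ? 0 : pad;
}

/* Final address of a label once its section is placed at section_base.
   Alignment done inside the .o file survives only if section_base is
   itself a multiple of the same alignment. */
unsigned long final_address(unsigned long section_base, unsigned long offset)
{
    return section_base + offset;
}

/* Nonzero if addr is a multiple of align (a power of two). */
int is_aligned(unsigned long addr, unsigned align)
{
    return (addr & (align - 1)) == 0;
}
```

For example, balign_padding(0x1009, 16, 7) is 7, while
balign_padding(0x1003, 16, 7) is 0 because 13 bytes would be needed.
A label at offset 0x20 in a section placed at 0x2004 ends up at
0x2024, which is no longer on a 16-byte boundary.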
> The produced code runs slower than code produced with gcc 2.6.3!
> The same was true for my old 486 66 and 386SX when comparing
> newer versions of gcc with 2.6.3.
You need to experiment with more optimization options than just -mcpu
and -march. GCC has lots of different optimization options, and -O2
turns on almost all of them; you should try to selectively turn on
only some of them. Section 14.2 of the FAQ refers to this, although
it's probably not up-to-date yet with the latest GCC releases.
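When comparing builds that differ by one option at a time, it helps to
time the same workload several times and keep the best run, since
small differences can otherwise drown in noise. The following is only
a rough sketch: sample_workload is a placeholder for your real code,
best_seconds is a made-up helper name, and clock() may be too coarse
on some systems for short runs.

```c
#include <assert.h>
#include <time.h>

/* Written to by the workload so the compiler cannot discard the loop. */
volatile unsigned long sink;

/* Placeholder workload; substitute the code you actually want to time. */
void sample_workload(void)
{
    unsigned long i, sum = 0;

    for (i = 0; i < 1000000UL; i++)
        sum += i * i;              /* unsigned arithmetic wraps harmlessly */
    sink = sum;
}

/* Run fn() the given number of times and return the shortest elapsed
   CPU time in seconds, as measured by clock(). */
double best_seconds(void (*fn)(void), int runs)
{
    clock_t best = 0;
    int i;

    assert(runs > 0);
    for (i = 0; i < runs; i++) {
        clock_t start = clock();
        clock_t elapsed;

        fn();
        elapsed = clock() - start;
        if (i == 0 || elapsed < best)
            best = elapsed;
    }
    return (double)best / CLOCKS_PER_SEC;
}
```

Build the same source once per option set you want to compare (for
instance plain -O against -O plus a single option such as
-funroll-loops) and compare the values returned by
best_seconds(sample_workload, 5).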
Also, GCC tries very hard to align the stack on an 8-byte boundary,
and that causes it to emit a lot of stack-alignment instructions
(subl $4, %esp etc.). This could lose big time if your program doesn't
need this alignment. I suggest experimenting with the
alignment-related options.
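One way to see whether the stack alignment makes any difference for
your program is to check where local variables actually land. The
sketch below reports how far an address is from an 8-byte boundary;
the helper names are made up for illustration. I am assuming here that
your GCC accepts -mpreferred-stack-boundary=N (the stack is kept
aligned to 2^N bytes); check the documentation of your version before
relying on it.

```c
#include <assert.h>
#include <stddef.h>

/* Distance in bytes from p down to the previous multiple of align
   (a power of two); 0 means p is aligned. */
unsigned misalignment(const void *p, unsigned align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (unsigned)((size_t)p & (align - 1));
}

/* Misalignment of a local double relative to an 8-byte boundary.
   The variable is volatile so that it really lives on the stack. */
unsigned local_double_misalignment(void)
{
    volatile double d = 0.0;

    return misalignment((const void *)&d, 8);
}
```

Calling local_double_misalignment() from several places in a program
built with different alignment options shows whether doubles end up on
8-byte boundaries; a result of 0 means the local is aligned.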
> My conclusion is, to usually use -O only, and to still have
> an old version of gcc around.
There's nothing wrong with this conclusion, but I think there's a lot
more to check before it can be called general.