Mail Archives: djgpp/2000/04/27/10:19:01
[posted and mailed]
salvador wrote:
[Adding one const to a static variable can degrade performance
by a factor of ten on my AMD K6-2]
> I found the problem, when you make the value a constant it is stored
>in the code segment, too close to the code that is executed. It looks
>like that's close enough to be inside the pipeline (probably because the
>CPU is fetching groups of aligned bytes, not just code). Looks like K6
>does some pipeline flush when you read/write data that is inside the
>pipeline (or perhaps a pre-fetch buffer is a better name for it).
Salvador, thanks for looking into this.
> If you put zseed inside the text segment and close to the function
>things get even worse. I got 1:7 and 1:16 for the second case.
At least, I am not alone, and I was not hallucinating.
>But if you move both variables quite far (inside of the .text segment)
>you'll get the same times that when using the variables in .data.
Yes, I have seen something like that. But this still doesn't explain
everything. Minor changes to the code (outside of the offending
function) will make the const and non-const versions perform
at the same speed. These changes can obviously change alignment.
But the alignment of the variable alone is not the reason
(as you also said in another post), because "hand-aligning" in the
assembler code (and rechecking with objdump and fsdb) doesn't yield
coherent results.
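
For anyone who wants to reproduce the comparison, here is a minimal sketch of the two variants. The generator, the multiplier value and the names next_const, next_plain and time_calls are assumptions for illustration only; this is not the code discussed in this thread. An optimizing compiler may fold the const value into the instruction stream, so where the variable really ends up still has to be checked with objdump.

```c
#include <stdint.h>
#include <time.h>

/* Hypothetical multiplier: the value used in the original code is not shown here. */
static const uint32_t mul_const = 69069u; /* const: may be placed next to the code on some targets */
static uint32_t mul_plain = 69069u;       /* non-const: placed in .data */

static uint32_t seed_a = 1u;
static uint32_t seed_b = 1u;

/* One generator step that reads the multiplier from the const variable. */
uint32_t next_const(void)
{
    seed_a = seed_a * mul_const + 1u;
    return seed_a;
}

/* The same step, reading the multiplier from the non-const variable. */
uint32_t next_plain(void)
{
    seed_b = seed_b * mul_plain + 1u;
    return seed_b;
}

/* Call fn n times and return the elapsed CPU time in seconds.
   The accumulated value is stored in *sink so the calls are not optimized away. */
double time_calls(uint32_t (*fn)(void), unsigned long n, uint32_t *sink)
{
    unsigned long i;
    uint32_t acc = 0;
    clock_t start = clock();

    for (i = 0; i < n; i++)
        acc ^= fn();
    *sink = acc;
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Calling time_calls(next_const, n, &sink) and time_calls(next_plain, n, &sink) with the same large n gives the two times to compare; both functions produce the same sequence, so any difference comes from where the multiplier is stored and how the code is laid out.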
> BTW: If you use const *and* compile it as C++ you'll get faster code
>because C++ allows the compiler to use consts as C uses #define macros if
>the compiler considers that's favorable.
This is of course true. But I think in this special case it can't make
a difference, because AFAIK there is no MUL imm32 instruction.
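
To illustrate what kind of code is affected: a 32x32->64 bit unsigned multiply is normally compiled to the one-operand MUL, which takes its operand from a register or from memory, never from an immediate. The sketch below assumes a multiply-with-carry style step; the function name mwc_step, the variable mwc_mul and its value are hypothetical and are not taken from the code in this thread.

```c
#include <stdint.h>

/* Hypothetical multiplier, chosen only for illustration. */
static const uint32_t mwc_mul = 698769069u;

/* One multiply-with-carry style step.
   The full 64-bit product is needed: the low half becomes the new state
   and the high half becomes the new carry. */
uint32_t mwc_step(uint32_t *x, uint32_t *carry)
{
    uint64_t t = (uint64_t)mwc_mul * *x + *carry;

    *x = (uint32_t)t;             /* low 32 bits  */
    *carry = (uint32_t)(t >> 32); /* high 32 bits */
    return *x;
}
```

Because the high half of the product is used, the compiler cannot use the truncating IMUL forms that accept an immediate, so the multiplier has to come from a register or from memory.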
BTW, I haven't seen this weird behaviour on Linux (with the same
compiler and binutils version).
Is there a switch for gcc that causes it not to store const data
in the code segment? This might help not only my AMD CPU, but also
other CPUs, as Eli reported a 1:3 speed difference with a P166.
It may even be desirable to default to such a switch for particular
-mcpu values, or when compiling with -O and without -g.
--
Regards, Dieter Buerssner