Mail Archives: djgpp/2000/04/27/10:19:01

Date: Thu, 27 Apr 2000 09:15:39 -0400
Message-Id: <200004271315.JAA03722@delorie.com>
X-Posting-Agent: Hamster/1.3.13.0
Newsgroups: comp.os.msdos.djgpp
Posted-and-Mailed: yes
Subject: Re: [long] gcc performance and possible bug
From: buers AT gmx DOT de (Dieter Buerssner)
References: <39046544 DOT ADB90632 AT inti DOT gov DOT ar>
User-Agent: Xnews/03.02.04
To: salvador <djgpp AT delorie DOT com>
Mime-Version: 1.0
Reply-To: djgpp AT delorie DOT com

[posted and mailed]

salvador wrote:

[Adding one const to a static variable can degrade performance
 by a factor of ten on my AMD K6-2]
>  I found the problem, when you make the value a constant it is stored
>in the code segment, too close to the code that is executed. It looks
>like that's close enough to be inside the pipeline (probably because the
>CPU is fetching groups of aligned bytes, not just code). Looks like K6
>does some pipeline flush when you read/write data that is inside the
>pipeline (or perhaps a pre-fetch buffer is a better name for it).

Salvador, thanks for looking into this. 

>  If you put zseed inside the text segment and close to the function
>things get even worse. I got 1:7 and 1:16 for the second case. 

At least, I am not alone, and I was not hallucinating.

>But if you move both variables quite far (inside of the .text segment)
>you'll get the same times as when using the variables in .data.

Yes, I have seen something like that. But this still doesn't explain
everything. Minor changes to the code (outside of the offending
function) will make the const and non-const versions perform
at the same speed. These changes can obviously change alignment.
But the alignment of the variable alone is not the reason
(as you also said in another post), because "hand-aligning" in the
assembler code (and rechecking with objdump and fsdb) doesn't yield
coherent results.
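
For anyone who wants to check placement from inside the program rather
than with objdump, here is a minimal sketch. The helper names
offset_in_block and distance_bytes are hypothetical and not part of
the original code. The first reports where an address falls inside an
aligned block (for example 32, assuming 32-byte cache lines), the
second reports how far apart two addresses are.

```c
#include <stddef.h>
#include <stdint.h>

/* Offset of an address within an aligned block of `block` bytes.
   `block` must be a power of two (e.g. 32 for an assumed 32-byte line). */
size_t offset_in_block(const void *p, size_t block)
{
    return (size_t)((uintptr_t)p & (block - 1));
}

/* Absolute distance in bytes between two addresses. */
size_t distance_bytes(const void *a, const void *b)
{
    uintptr_t x = (uintptr_t)a;
    uintptr_t y = (uintptr_t)b;
    return (size_t)(x > y ? x - y : y - x);
}
```

To compare a variable with a function, the function's address can be
passed as (const void *)(uintptr_t)&func. That conversion is not
strictly portable C, but gcc accepts it.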

>  BTW: If you use const *and* compile it as C++ you'll get faster code
>because C++ allows the compiler to use consts as C uses #define  macros if
>the compiler considers that's favorable.

This is of course true. But I think that in this special case it can't
make a difference, because AFAIK there is no MUL imm32 instruction.
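
To make the MUL point concrete, here is a minimal sketch, assuming a
generator that needs the full 64-bit product of two 32-bit values. The
function name mwc_next, the carry variable, the multiplier value and
the initial seed are all hypothetical; only the name zseed comes from
this thread.

```c
#include <stdint.h>

/* Hypothetical multiplier; the value used in the original code
   is not shown in this thread. */
static const uint32_t MULTIPLIER = 2051013963u;

static uint32_t zseed = 12345u; /* state; name from the thread, value assumed */
static uint32_t carry = 0u;     /* hypothetical carry word */

/* One step: 32x32 -> 64 bit unsigned multiply, then split the result. */
uint32_t mwc_next(void)
{
    uint64_t t = (uint64_t)MULTIPLIER * zseed + carry;
    zseed = (uint32_t)t;          /* low half becomes the new state */
    carry = (uint32_t)(t >> 32);  /* high half becomes the new carry */
    return zseed;
}
```

On the i386 the widening unsigned multiply is the one-operand MUL,
which takes its explicit operand from a register or from memory, never
from an immediate. So even if the compiler knows the value of the
multiplier, it still has to load it into a register or read it from
memory.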

BTW, I haven't seen this weird behaviour on Linux (with the same 
compiler and binutils versions).

Is there a switch for gcc that causes it not to store const data
in the code segment? This might help not only my AMD CPU, but also
other CPUs, as Eli reported a 1:3 speed difference with a P166.
It may even be desirable to default to such a switch for specific
-mcpu settings, or when compiling with -O and without -g.

-- 
Regards, Dieter Buerssner
