From: buers AT gmx DOT de (Dieter Buerssner)
Newsgroups: comp.os.msdos.djgpp
Subject: Re: [long] gcc performance and possible bug
Date: 27 Apr 2000 12:21:03 GMT
Lines: 50
Message-ID: <8e9iij.3vs4qlb.1@buerssner-17104.user.cis.dfn.de>
References: <39046544 DOT ADB90632 AT inti DOT gov DOT ar>
NNTP-Posting-Host: pec-116-136.tnt8.s2.uunet.de (149.225.116.136)
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit
X-Trace: fu-berlin.de 956838063 9142809 149.225.116.136 (16 [17104])
X-Posting-Agent: Hamster/1.3.13.0
User-Agent: Xnews/03.02.04
To: djgpp AT delorie DOT com
DJ-Gateway: from newsgroup comp.os.msdos.djgpp
Reply-To: djgpp AT delorie DOT com

[posted and mailed]

salvador wrote:

[Adding one const to a static variable can degrade performance
by a factor of ten on my AMD K6-2]

> I found the problem, when you make the value a constant it is stored
>in the code segment, too close to the code that is executed. It looks
>like that's close enough to be inside the pipeline (probably because the
>CPU is fetching groups of aligned bytes, not just code). Looks like K6
>does some pipeline flush when you read/write data that is inside the
>pipeline (or perhaps a pre-fetch buffer is a better name for it).

Salvador, thanks for looking into this.

> If you put zseed inside the text segment and close to the function
>things gets even worst. I got 1:7 and 1:16 for the second case.

At least I am not alone, and I was not hallucinating.

>But if you move both variables quite far (inside of the .text segment)
>you'll get the same times that when using the variables in .data.

Yes, I have seen something like that. But this still doesn't explain
everything. Minor changes to the code (outside of the offending
function) will make the const and non-const versions perform at the
same speed. These changes can obviously change alignment.
But the alignment of the variable alone is not the reason (as you also
said in another post), because "hand-aligning" in the assembler code
(and rechecking with objdump and fsdb) doesn't yield coherent results.

> BTW: If you use const *and* compile it as C++ you'll get faster code
>because C++ allows the compiler to use consts as C uses #define macros if
>the compiler considers that's favorable.

This of course is true. But I think that in this special case it can't
make a difference, because AFAIK there is no MUL imm32 instruction.

BTW, I haven't seen this weird behaviour on Linux (with the same
compiler and binutils versions).

Is there a switch for gcc that causes it not to store const data in
the code segment? This might help not only my AMD CPU but also other
CPUs, as Eli reported a 1:3 speed difference with a P166. It may even
be desirable to default to such a switch for specific -mcpu settings,
or when compiling with -O and without -g.

-- 
Regards, Dieter Buerssner
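
For anyone who wants to repeat the comparison discussed in this thread, here is a minimal sketch. It assumes a multiplicative generator around a seed (the name zseed comes from the quoted text). The multiplier value, the names next_const, next_plain and seconds_for, and the timing approach are all hypothetical and not taken from the original program. Note that a current compiler with optimization enabled may fold both multipliers into immediates, in which case the two variants compile to the same code and no difference will show.

```c
#include <stdint.h>
#include <time.h>

/* Hypothetical multiplier; the real constant is not shown in the thread. */
#define MULTIPLIER 69069u

/* Variant A: const-qualified static. According to the thread, the DJGPP
   toolchain of that time could place such an object in the code segment. */
static const uint32_t mult_const = MULTIPLIER;

/* Variant B: plain static, which normally ends up in .data. */
static uint32_t mult_plain = MULTIPLIER;

/* One seed per variant so both produce the same sequence independently. */
static uint32_t zseed_a = 1u;
static uint32_t zseed_b = 1u;

/* Advance the generator using the const multiplier. */
uint32_t next_const(void)
{
    zseed_a *= mult_const;
    return zseed_a;
}

/* Advance the generator using the non-const multiplier. */
uint32_t next_plain(void)
{
    zseed_b *= mult_plain;
    return zseed_b;
}

/* Call next() the given number of times, store the sum of the results in
   *checksum (so the loop has an observable result), and return the
   processor time used in seconds as reported by clock(). */
double seconds_for(uint32_t (*next)(void), long iterations, uint32_t *checksum)
{
    clock_t start = clock();
    uint32_t sum = 0;
    long i;

    for (i = 0; i < iterations; i++)
        sum += next();

    *checksum = sum;
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Calling seconds_for(next_const, n, &sum) and seconds_for(next_plain, n, &sum) with the same large n gives two times to compare. Checking the output of objdump, as mentioned above, shows where each multiplier actually ended up.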