From: buers AT gmx DOT de (Dieter Buerssner)
Newsgroups: comp.os.msdos.djgpp
Subject: Re: [long] gcc performance and possible bug
Date: 10 Mar 2000 22:07:58 GMT
Message-ID: <8abrnt$3i4e8$1@fu-berlin.de>
References: <8a65uu$39fkt$1 AT fu-berlin DOT de> <200003101730 DOT MAA22032 AT indy DOT delorie DOT com>

Eli Zaretskii wrote:

>            P166/gcc 2.7.2.1  P166/gcc 2.95.2  C400/2.7.2.1  C400/2.95.2
>
> non-const      0.18053           0.06059         0.05445       0.01515
>
> const          0.30990           0.17480         0.05486       0.01556

Thank you, Eli, for reporting these numbers. At least you can confirm
that adding the little word const can make a huge difference (a factor
of three with your numbers).

The large difference between gcc 2.7.2.1 and 2.95.2 can be explained by
some micro-optimization I made for gcc 2.95.2. Originally, I had
implemented the offending function with inline assembly for djgpp,
because the C implementation (with earlier versions of gcc) produced
much worse code. The inline version shows the same weird behaviour as
the C version on my machine.

>I'm not sure what does this mean, but it looks like alignment-related
>problems. Otherwise, I cannot explain how come the same executable
>runs with the same speed on one machine, but not on another.

Yes, I cannot explain it either. But at least on my computer, it does
not seem to be alignment. After editing the gcc -O2 -S output to align
everything to 16-byte boundaries, and rechecking the alignment in the
executable, I found the same behaviour. Also, is it expected that the
P166 needs better alignment than the C400?
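For readers who do not have the source at hand, here is a minimal sketch
of the kind of const versus non-const comparison being discussed. It is
not the actual function: the multiplier value, the loop body, and all
names (MULT, run_const, run_nonconst, seconds_for) are made up for
illustration. The only thing it is meant to show is one variant reading
its multiplier from a const object and the other from a writable one.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical multiplier, chosen only for illustration. */
#define MULT 4164903690u

/* Read-only object: where it ends up (text, .rodata, ...) depends on
   the toolchain and options. */
static const uint32_t mult_const = MULT;
/* Writable object: normally placed in the data segment. */
static uint32_t mult_nonconst = MULT;

/* One step: widening 32x32 -> 64 bit multiply, plus carry-in.
   The sum cannot overflow 64 bits. */
static uint32_t step(uint32_t x, uint32_t *carry, uint32_t mult)
{
    uint64_t t = (uint64_t)x * mult + *carry;
    *carry = (uint32_t)(t >> 32);
    return (uint32_t)t;
}

/* Loop that takes the multiplier from the const object. */
uint32_t run_const(uint32_t seed, long n)
{
    uint32_t x = seed, carry = 1;
    assert(n >= 0);
    while (n-- > 0)
        x = step(x, &carry, mult_const);
    return x;
}

/* Same loop, but the multiplier comes from the writable object. */
uint32_t run_nonconst(uint32_t seed, long n)
{
    uint32_t x = seed, carry = 1;
    assert(n >= 0);
    while (n-- > 0)
        x = step(x, &carry, mult_nonconst);
    return x;
}

/* CPU time in seconds for n iterations of one variant. */
double seconds_for(uint32_t (*fn)(uint32_t, long), uint32_t seed, long n)
{
    clock_t t0 = clock();
    volatile uint32_t sink = fn(seed, n);  /* keep the call from being dropped */
    (void)sink;
    return (double)(clock() - t0) / CLOCKS_PER_SEC;
}
```

Calling seconds_for(run_const, 1, n) and seconds_for(run_nonconst, 1, n)
with a large n gives the two timings to compare. Note that a current
optimizing compiler may turn both multipliers into immediates, so the
generated code (gcc -O2 -S) should be checked before reading anything
into the numbers.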
The only difference I can see between the const and non-const versions
is a mul memref instruction, where memref is in the text/code segment in
one case and in the data segment in the other. But this alone is not
enough to explain it, because even with memref in the text segment, I
sometimes (depending on compiler options and minor changes in the
source) get the same performance with const as without. The non-const
version is always as fast as can be expected.

>There's no 1:10 speed difference, which seems to confirm what Salvador
>said: you are seeing some K-6-specific effect.

Unfortunately (for me), this seems to be the case. But your 1:3 ratio
on the P166 should not be acceptable either.

[Salvador, I will send you the requested source when my mail provider
is working again.]

Regards,
Dieter