Mail Archives: djgpp/2000/03/10/20:47:59

From: buers AT gmx DOT de (Dieter Buerssner)
Newsgroups: comp.os.msdos.djgpp
Subject: Re: [long] gcc performance and possible bug
Date: 10 Mar 2000 22:07:58 GMT
Lines: 51
Message-ID: <8abrnt$3i4e8$1@fu-berlin.de>
References: <Pine DOT SUN DOT 3 DOT 91 DOT 1000307103019 DOT 21628J-100000 AT is> <8a65uu$39fkt$1 AT fu-berlin DOT de> <200003101730 DOT MAA22032 AT indy DOT delorie DOT com>
NNTP-Posting-Host: pec-3-222.tnt2.s2.uunet.de (149.225.3.222)
Mime-Version: 1.0
X-Trace: fu-berlin.de 952726078 3740104 149.225.3.222 (16 [17104])
X-Posting-Agent: Hamster/1.3.13.0
User-Agent: Xnews/03.02.04
To: djgpp AT delorie DOT com
DJ-Gateway: from newsgroup comp.os.msdos.djgpp
Reply-To: djgpp AT delorie DOT com

Eli Zaretskii wrote:

>            P166/gcc 2.7.2.1   P166/gcc 2.95.2   C400/2.7.2.1   C400/2.95.2
>
> non-const  0.18053            0.06059           0.05445        0.01515
>
> const      0.30990            0.17480           0.05486        0.01556

Thank you, Eli, for reporting these numbers. At least they confirm
that adding the little word const can make a huge difference (nearly
a factor of three on your P166 with gcc 2.95.2).
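
To make the size of the effect explicit, the slowdown can be computed from
the quoted timings. This is only a small sketch; the function name
slowdown is made up for illustration and is not part of the benchmark.

```c
/* Hypothetical helper: ratio of the const timing to the non-const timing.
   Values above 1.0 mean the const version was slower. */
double slowdown(double t_const, double t_nonconst)
{
    return t_const / t_nonconst;
}
```

With the figures quoted above, slowdown(0.17480, 0.06059) is roughly 2.9
and slowdown(0.30990, 0.18053) is roughly 1.7 on the P166, while both
C400 columns give a ratio very close to 1.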

The large difference between gcc 2.7.2.1 and 2.95.2 can be explained
by some microoptimization I made for gcc 2.95.2. Originally, I had
implemented the offending function with inline assembly for djgpp,
because the C implementation (with earlier versions of gcc) produced
much worse code. The inline version does show the same weird behaviour
as the C version on my machine.

>I'm not sure what does this mean, but it looks like alignment-related
>problems.  Otherwise, I cannot explain how come the same executable
>runs with the same speed on one machine, but not on another.

Yes, I cannot explain it either. But at least on my computer,
it does not seem to be alignment. After editing the gcc -O2 -S output,
aligning everything to 16-byte boundaries, and rechecking the
alignment in the executable, I found the same behaviour.
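
For readers who want to repeat the alignment check, the test on an address
is simple arithmetic. The sketch below is an illustration only; the helper
names is_aligned and align_up are hypothetical, and the addresses would
have to be taken from a symbol listing or a debugger.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if addr is a multiple of boundary.
   boundary must be a power of two (16 for the check described above). */
int is_aligned(uintptr_t addr, uintptr_t boundary)
{
    assert(boundary != 0 && (boundary & (boundary - 1)) == 0);
    return (addr & (boundary - 1)) == 0;
}

/* Hypothetical helper: the first boundary-aligned address at or above addr. */
uintptr_t align_up(uintptr_t addr, uintptr_t boundary)
{
    assert(boundary != 0 && (boundary & (boundary - 1)) == 0);
    return (addr + boundary - 1) & ~(boundary - 1);
}
```

An address such as 0x1008 fails is_aligned(addr, 16), and align_up moves
it to 0x1010.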

Also, is it expected that the P166 needs better alignment than the C400?

The only difference I can see between the const and non-const versions is

  mul memref

where memref is in the text (code) segment in the const case and in the
data segment in the non-const case. But this alone is not enough to
explain it, because even with memref in the text segment, I sometimes
(depending on compiler options and minor changes in the source) get
the same performance with const as without. The non-const version is
always as fast as can be expected.
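
As an illustration of the two cases, here is a minimal sketch. It is not
the function from the benchmark, whose source is not shown in this thread;
the names and the multiplier value are invented. Whether the compiler
really emits a multiply with a memory operand, rather than folding the
constant into the instruction, depends on the gcc version and options.

```c
#include <stdint.h>

/* Hypothetical multiplier, const-qualified: the compiler may place it
   in read-only storage, which is the text-segment case described above. */
static const uint32_t factor_const = 69069u;

/* Same value without const: it lives in the writable data segment. */
static uint32_t factor_nonconst = 69069u;

/* 32x32 -> 64 bit multiply by the const object. */
uint64_t widemul_const(uint32_t x)
{
    return (uint64_t)x * factor_const;
}

/* The same multiply by the non-const object. */
uint64_t widemul_nonconst(uint32_t x)
{
    return (uint64_t)x * factor_nonconst;
}
```

Both functions return the same value for every input; only the placement
of the multiplier differs, which can be checked in the gcc -S output.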

>There's no 1:10 speed difference, which seems to confirm what Salvador
>said: you are seeing some K-6-specific effect.

Unfortunately (for me), this seems to be the case. But your 1:3 ratio
on the P166 should not be acceptable either.

[Salvador, I will send you the requested source when my mail provider
functions again.]

Regards, Dieter
