Message-ID: <19990520104723.18724@atrey.karlin.mff.cuni.cz>
Date: Thu, 20 May 1999 10:47:23 +0200
From: Jan Hubicka
To: pgcc AT delorie DOT com
Subject: Re: Benchmark PGCC vs EGCS on a K6-2
References: <373F3AA2 DOT A446D611 AT informatik DOT hu-berlin DOT de> <19990519105631 DOT 40676 AT atrey DOT karlin DOT mff DOT cuni DOT cz> <3743ADE8 DOT C938ADBB AT informatik DOT hu-berlin DOT de>
Mime-Version: 1.0
Content-Type: text/plain; charset=iso-8859-2
Content-Transfer-Encoding: 8bit
X-Mailer: Mutt 0.84
In-Reply-To: <3743ADE8.C938ADBB@informatik.hu-berlin.de>; from Jens-Uwe Rumstich on Thu, May 20, 1999 at 06:38:32AM +0000
Reply-To: pgcc AT delorie DOT com
X-Mailing-List: pgcc AT delorie DOT com
X-Unsubscribes-To: listserv AT delorie DOT com
Precedence: bulk

> Hi!
>
> First, please don't trust the numbers I posted at all. The switches
> "pgcc -mk6 -O3" and "pgcc -mk6 -O4" produce the same executable, but
> the results had 3 seconds difference. Too much to call these results
> reliable :-((
>
> > About a year ago I've done some tuning of egcs for K6-2. I've removed some of
> > the K6-2 specific optimizations, because they seemed to produce slower code. There
> > seems to be an important problem in the K6 documentation. It recommends things that often
> > cause performance loss. The author of the original K6 stuff for egcs just blindly followed
> > their recommendations, so many of his changes were performance misses (especially changing
> > xor reg,reg to mov reg,0)
>
> ooops... The mov is not faster?
>

The only advantage of mov is the reduced dependency on flags. But this
advantage is not big enough to offset the code size increase and the
decoder slowdown caused by the very large opcode.

> > Many (not all) of these changes are in recent egcs snapshots (aka gcc 2.95.0). Because
> > I don't have any access to this CPU anymore, I would love to hear about your results with
> > this version of gcc.
>
> I'll try them out and write about them.
> Do you know a way to get exact numbers?
> I still don't know why my
> results are that wrong :-(

Hmm... don't know. You might also try out the egcs benchmark suite. It
gets lots of results and they are pretty exact (at least for me) and
usable for tuning the compiler. Take a look at the egcs homepage to
get it...

>
> > K6 seems to have serious problems with decoding speed. I've made new haifa scheduler hooks for
> > decoding that worked quite well (I have also versions for Pentium and PPro available, the PPro
> > version is untested),
>
> It seems to me that the decoders of the K6 are not strong enough to
> feed all the execution units, so this is the bottleneck. One should
> probably try to output instructions which result in 4 Risc-Ops per
> cycle. That means 2 short instructions, where each one is broken into
> 2 Risc-Ops, or a long instruction, which is broken into 4 Risc-Ops.

Well, this is quite hard to reach. IMO it is OK just to schedule the
code so that the more complex (first decoder only) instructions are
placed where the first decoder can take them, and vector decoded
instructions are placed at points where decoding of the previous
instructions was faster than their execution, so the 2 cycle delay
does not cause a performance loss.

> In the PGCC-FAQ I read about a "recombining" optimization, which seems
> to be intended to do exactly this. But it was marked as disabled,
> because it may slow down some code...

Recombining, as far as I can remember, only attempts to reverse the
riscify optimization for the Pentium CPU. This change caused a
performance loss due to reduced pairing opportunities. It should not
affect non-riscified code anyway.

>
> > On K6 it brought quite large speedups (-10 - 500%, usually about 10%), but the changes necessary
> > to i386.md are quite large, so it would take lots of time to add them into gcc.
>
> And would it make it look even uglier, right ??
> ;-)

Well, I personally think it made it look a bit better, because I've
rewritten many patterns in a way that makes instruction selection clear
to the compiler (and lets me decide, using attributes, what instruction
will be on the output).

Honza

> > Honza
>
> cu
> Jens-Uwe

-- 
OK. Lets make a signature file.
+-------------------------------------------------------------------------+
| Jan Hubicka (Jan Hubi\v{c}ka in TeX)         hubicka AT freesoft DOT cz |
| Czech free software foundation: http://www.freesoft.cz                  |
|AA project - the new way for computer graphics - http://www.ta.jcu.cz/aa |
| homepage: http://www.paru.cas.cz/~hubicka/, games koules, Xonix, fast   |
| fractal zoomer XaoS, index of Czech GNU/Linux/UN*X documentation etc.   |
+-------------------------------------------------------------------------+