Mail Archives: pgcc/1999/08/13/16:56:49

Date: Thu, 12 Aug 1999 18:25:14 +0200
From: Gaddoni Marco <marco DOT gaddoni AT imola DOT nettuno DOT it>
To: pgcc AT delorie DOT com
Subject: Re: optimizing for k6
Message-ID: <19990812182514.A1071@enterprise>
References: <Pine DOT GSO DOT 4 DOT 10 DOT 9908051303340 DOT 29067-100000 AT legolas DOT mdh DOT se> <37AC0114 DOT F3BC458A AT neuss DOT netsurf DOT de> <19990808155531 DOT 34641 AT atrey DOT karlin DOT mff DOT cuni DOT cz>
Mime-Version: 1.0
X-Mailer: Mutt 0.95.6i
In-Reply-To: <19990808155531.34641@atrey.karlin.mff.cuni.cz>; from Jan Hubicka on Sun, Aug 08, 1999 at 03:55:31PM +0200
Reply-To: pgcc AT delorie DOT com
X-Mailing-List: pgcc AT delorie DOT com
X-Unsubscribes-To: listserv AT delorie DOT com

On Sun, Aug 08, 1999 at 03:55:31PM +0200, Jan Hubicka wrote:
> > Henrik Berglund SdU wrote:
> > > 
> > > ftp://ftp.sinica.edu.tw/pub/doc/cpu/www.amd.com/K6/k6docs/pdf/21828a.pdf
> > > 
> > > -----------------------------------------------------------------------------
> > > Henrik DOT Berglund AT mds DOT mdh DOT se
> > > http://www.mds.mdh.se/~adb94hbd/
> > 
> > This is a long-known document; it does help somewhat in optimizing. But the
> > information is just too incomplete to get really good optimizations.
> > 
> > There are also a lot of mistakes in that document. I had a little
> > discussion with AMD technical support, but they did not help :-(
> > AMD Technical Support wrote:
> I am just working on the K6 support for the new ia32 backend. You are right
> that the document is quite bad. It recommends things that hurt
> and fails to tell you about details that really help. But the AMD technical
> support is quite kind in answering all specific questions about the optimizations.
> 
> The most important optimizations for K6 seem to be alignment changes
> (K6 requires pretty weird alignment before every instruction with a 2-byte or longer
> opcode, which is also not mentioned in the docs) and instruction selection
> (some instructions that are pretty common are vector decoded; the manual
> fails to document that). Probably most important for gcc
> were inc/dec with long form and nonmemory operand, neg patterns, shift patterns
> and setcc.
> I've implemented lots of other stuff and the results are pretty good IMO.
> The byte benchmark optimized for 386 using the old backend gives 4.23/2.51
> (integer/fp index), the new backend 4.55/2.41, visual ZC++ 4.40/2.42, and my current result is
> 4.89/2.61.
> 
> Maybe I can write some sort of document describing the most interesting surprises I've found
> while playing with the optimizations.
> 
> Honza


Have you seen the 3DNow! SDK? It is free from the AMD web site and contains
a CPU simulator for the K6-II (and K7 in the latest version, I think) that
shows all kinds of stalls and resource limits for the (small) programs that
you simulate (you can see whether a bubble in the pipeline is due to a decode
problem, a resource limit, pairing, or ...). It can be used to
reverse-engineer the model of the CPU they are using. Unfortunately, you need
MS Visual C++ to produce the debug information that the simulator uses ...

Ciao. Marco.

-- 
This is not a Sig. (With homage to Magritte).
