Mail Archives: pgcc/2000/02/02/16:00:19

Date: Wed, 2 Feb 2000 20:29:26 +0100 (CET)
From: Martin Ockajak <mandos AT hq DOT alert DOT sk>
To: pgcc AT delorie DOT com
Subject: Re: pgcc and egcs alignment -- function, basic block and string
In-Reply-To: <20000130211158.D641@cerebro.laendle>
Message-ID: <Pine.LNX.4.21.0002022017450.16833-100000@hq.alert.sk>
MIME-Version: 1.0
Reply-To: pgcc AT delorie DOT com
Errors-To: dj-admin AT delorie DOT com
X-Mailing-List: pgcc AT delorie DOT com
X-Unsubscribes-To: listserv AT delorie DOT com

On Sun, 30 Jan 2000, Marc Lehmann wrote:

> > 10% is really a lot, inside a loop, which takes (about) 25 * 35 cycles.
> 
> That's very much. I doubt it really is the three nops, but...

Well, AFAIK the K6 family (especially the K6-1) is pretty sensitive to
insns being split across a cache line boundary. Such cases slow down
instruction decoding. Considering how important decoder performance is
on the K6, how short the loop is (only 25-35 cycles, as was said),
and assuming some longer insns were split this way, a 10% difference
is IMHO possible.
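
To make the split-insn case concrete, here is a minimal C sketch of the
check involved. It is only an illustration: the 32-byte line size is an
assumption about the K6 L1 instruction cache, and the function name
insn_straddles_line is hypothetical, not taken from any tool.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 32-byte L1 instruction-cache lines (K6 family). */
#define CACHE_LINE_BYTES 32u

/*
 * Hypothetical helper: returns true if an instruction of `len` bytes
 * starting at address `addr` has its first and last byte in different
 * cache lines, i.e. it is split over a cache line boundary.
 */
static bool insn_straddles_line(uintptr_t addr, size_t len)
{
    assert(len > 0 && len <= CACHE_LINE_BYTES);
    uintptr_t first_line = addr / CACHE_LINE_BYTES;
    uintptr_t last_line  = (addr + len - 1) / CACHE_LINE_BYTES;
    return first_line != last_line;
}
```

For example, a 6-byte insn starting at offset 27 of a line occupies
bytes 27..32 and is split, while the same insn starting at offset 26
ends on byte 31 and is not.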

BTW: On my K6-2, I get the best performance when loops and functions are
aligned to an 8-byte boundary. But this (as well as the cache-line-end
issue) deserves more testing, so I will do that over the weekend.
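
As a rough sketch of what 8-byte alignment costs in padding, the C
helper below computes how many filler bytes (e.g. nops) are needed in
front of a loop or function entry. The name align_padding is
hypothetical, and the code assumes the alignment is a power of two; it
does not model how any particular compiler chooses its padding.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of padding bytes needed so that `addr`
 * is rounded up to the next multiple of `align`.
 * Assumption: `align` is a nonzero power of two (8 in the case above).
 */
static uintptr_t align_padding(uintptr_t addr, uintptr_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (align - (addr & (align - 1))) & (align - 1);
}
```

With an 8-byte boundary, a loop head that would otherwise land at
0x1005 needs 3 bytes of padding, and one already at 0x1008 needs none.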

Have a nice day

------------------------------------------------------------------------------
Martin Ockajak a.k.a. Mandos  <mandos AT hq DOT alert DOT sk>  http://hq.alert.sk/~mandos
"The goal of Computer Science is to build something that will last at
least until we've finished building it."
