Mail Archives: pgcc/2000/01/28/22:09:04

Date: Sat, 29 Jan 2000 03:21:01 +0100
From: Jan Hubicka <hubicka AT atrey DOT karlin DOT mff DOT cuni DOT cz>
To: pgcc AT delorie DOT com
Subject: Re: pgcc and egcs alignment -- function, basic block and string
Message-ID: <20000129032101.A25630@atrey.karlin.mff.cuni.cz>
References: <38921CD6 DOT 2A725779 AT ix DOT netcom DOT com>
Mime-Version: 1.0
X-Mailer: Mutt 1.0i
In-Reply-To: <38921CD6.2A725779@ix.netcom.com>; from cbsears@ix.netcom.com on Fri, Jan 28, 2000 at 02:48:54PM -0800
Reply-To: pgcc AT delorie DOT com
Errors-To: dj-admin AT delorie DOT com
X-Mailing-List: pgcc AT delorie DOT com
X-Unsubscribes-To: listserv AT delorie DOT com

> In pgcc some basic blocks (loops?) are being aligned.
> These 16 byte blocks are ifetch blocks.
> Quoting Agner Fog, "While aligning data is always important,
> aligning code is not necessary on the PPlain and PMMX."

The alignment (4,,7) is consistent with the Intel Optimizing Manual's
recommendation. Changing this value would require quite extensive testing to
prove your statement. For the Pentium, the 4,,7 alignment seems to be a win
according to my (simple) tests.
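
For reference, a minimal C sketch of what gas documents for
".p2align 4,,7": the first operand asks for a 2^4 = 16 byte boundary, the
second operand (the fill value) is omitted, and the third caps the padding
at 7 bytes. If more than 7 bytes would be needed, no padding is emitted at
all. The helper name p2align_padding is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: bytes of padding emitted for
 * ".p2align log2_align,,max_skip" at location counter addr.
 * Returns 0 when addr is already aligned, or when the padding
 * needed exceeds max_skip (the alignment is then skipped). */
static unsigned p2align_padding(uint32_t addr, unsigned log2_align,
                                unsigned max_skip)
{
    assert(log2_align < 32);
    uint32_t align = (uint32_t)1 << log2_align;
    uint32_t needed = (align - (addr & (align - 1))) & (align - 1);
    return needed <= max_skip ? (unsigned)needed : 0;
}
```

So with 4,,7 a label at offset 9 through 15 of a 16-byte block is pushed to
the next block, while a label at offset 1 through 8 stays where it is.
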
> He means with respect to instruction fetch, not cache line.
> Is this alignment a good idea?  It seems unnecessary from
> a processor point of view and it seems to increase
> the cache footprint.  The p2align 4,,7 means align min(2^4,7)
> and it means that there may be some padded nop instructions.
> This is a COST for ifetch alignment in addition to the
> cache footprint.
> 
>          cmpl $31,%ebx
>          jle .L1476
>          .p2align 4,,7
>         .L1471:
>          movl %edx,(%ebp)
>          addl $4,%ebp
> 
> In pgcc strings are being aligned to cache lines.
> But is alignment even necessary for strings?
It is. Consider the memset/memcpy/strlen expanders. These can work
much better when they know that the destination is aligned to the word size.
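
A rough way to see the effect is to count stores. The sketch below is only
a model under stated assumptions, not what GCC's expanders actually emit:
it assumes byte stores up to the first word boundary, word stores for the
bulk, and byte stores for the tail. The name fill_store_count is
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rough model, not GCC's actual expander: number of store
 * instructions needed to fill len bytes at addr using byte stores
 * up to the first word boundary, word stores for the bulk, and
 * byte stores for the tail.  word must be a power of two. */
static size_t fill_store_count(uintptr_t addr, size_t len, size_t word)
{
    assert(word != 0 && (word & (word - 1)) == 0);
    size_t head = (size_t)((word - (addr & (word - 1))) & (word - 1));
    if (head > len)
        head = len;
    size_t body = (len - head) / word;   /* full word stores */
    size_t tail = (len - head) % word;   /* leftover byte stores */
    return head + body + tail;
}
```

In this model, filling 32 bytes with 4-byte words takes 8 stores from an
aligned address and 11 from an address that is one byte past a word
boundary. When the alignment is known at compile time, the head part and
the runtime test for it can be dropped entirely.
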
> 
> egcs has the same string (32) and basic block alignment (.p2align 4,,7)
> But it uses .align 4 (!) for functions.  I might point out that the gas
> documentation has a bug in the .align description saying that the
> operand is like the .p2align operand, the number of bits to shift.

I will verify this tomorrow and, if you are correct, I will fix this bug
(in both gas and gcc).
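
The two readings of the .align operand give very different results, which
is why the point matters. Here is a small C sketch of both, with made-up
function names. Which reading gas applies depends on the target and object
format, so this only illustrates the arithmetic.

```c
#include <assert.h>

/* Operand read as a byte count: ".align 4" -> 4-byte boundary. */
static unsigned long align_as_bytes(unsigned operand)
{
    return operand;
}

/* Operand read as a power of two, like .p2align:
 * ".align 4" -> 2^4 = 16-byte boundary. */
static unsigned long align_as_log2(unsigned operand)
{
    assert(operand < 32);
    return 1UL << operand;
}
```

Under the byte-count reading, ".align 4" aligns functions to only 4 bytes,
and ".p2align 4" would be needed to get a 16-byte boundary.
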
> 
> So in summary, I think that functions should be aligned to cache lines
> and that basic blocks and strings should not be aligned at all.
Gcc doesn't align every basic block. It uses alignment at the tops of loops,
where alignment to the ifetch block is necessary. A loop top that falls at the
very end of an ifetch block may cause stalls in the decoding process, IMO.
A second kind of alignment is done after barriers, where the situation is in
many respects equivalent to a function entry point.

Aligning to a 16-byte boundary can be quite a good tradeoff between code size
and cache line fetching efficiency. While a function starting near the end of
a cache line is catastrophic, a function starting in the middle of one is not
so bad.
Again, the Intel Optimizing Manual recommends this. I believe Intel did some
experiments before saying so.
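
To put numbers on "near the end" versus "in the middle", here is a minimal
C sketch. The helper name is made up for illustration; the 16-byte ifetch
block and 32-byte cache line sizes are the ones discussed in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes from addr to the end of the enclosing block (ifetch block
 * or cache line).  block must be a power of two. */
static unsigned bytes_left_in_block(uint32_t addr, unsigned block)
{
    assert(block != 0 && (block & (block - 1)) == 0);
    return block - (unsigned)(addr & (block - 1));
}
```

A function entry at offset 30 of a 32-byte line gets only 2 useful bytes
from its first line. Once the entry is aligned to 16 bytes it sits at
offset 0 or 16, so it always gets at least 16.
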

Honza
> 
> Chris Sears
> cbsears AT ix DOT netcom DOT com
