Mail Archives: djgpp/1999/08/06/05:49:58

Date: Thu, 5 Aug 1999 16:29:36 +0300
From: Alexander Bokovoy <bokovoy AT bspu DOT unibel DOT by>
X-Mailer: The Bat! (v1.33) UNREG / CD5BF9353B3B7091
Organization: BSPU named after Maxim Tank
X-Priority: 3 (Normal)
Message-ID: <17687.990805@bspu.unibel.by>
To: Duncan Coutts <djgpp AT delorie DOT com>
Subject: Re: Data Alignment for Optimal Access
In-reply-To: <7oak0j$11v$1@lure.pipex.net>
References: <7oak0j$11v$1 AT lure DOT pipex DOT net>
Mime-Version: 1.0
Reply-To: djgpp AT delorie DOT com
X-Mailing-List: djgpp AT delorie DOT com
X-Unsubscribes-To: listserv AT delorie DOT com

You may want to take a look at PGCC, a version of GCC optimized for
Pentium processors. It does a good job of using MMX instructions (and
therefore of MMX-friendly data alignment). Also, AFAIK the latest GCC
(2.95 - http://www.lanet.lv/~pavenis/djgpp.html) supports MMX optimization.

On 05.08.1999 Duncan Coutts wrote:
> I know that gcc (on PC targets) aligns data to 4-byte
> boundaries because its normal word length is
> 32 bits (4 bytes). Performance of some types of operations
> can be significantly improved if different alignments are
> used.

> For example, the Intel MMX tutorials strongly encourage
> quad word (64-bit) memory operations to be quad word
> aligned (1 cycle versus 3 when the data is cached).
> Also, by aligning larger structures (such as matrices) on
> 32-byte boundaries, caching performance can be
> improved (each cache line is 32 bytes).

> These optimisations obviously should only be considered
> when doing assembly optimisation of time critical loops
> (such as matrix and vector operations in a 3D graphics
> pipeline).

> Does anyone have any suggestions on how to do the
> memory alignment? Are there any compiler extensions
> similar to   __attribute__ ((packed))  ? I know gcc aligns
> data on the stack, however I suspect that it would not be
> possible to force 8-byte alignment for local variables or
> parameters.
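
On the compiler-extension question: GCC documents an `aligned`
attribute alongside `packed`. Below is a minimal sketch, assuming a
GCC-compatible compiler. The names Quad, Matrix4, g_quad, g_matrix and
is_aligned are made up for illustration. Whether the requested
alignment is honored for stack locals, or for values larger than the
object format supports, depends on the compiler version and target, so
treat that part as something to verify rather than a guarantee.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 64-bit quantity intended for quad-word (MMX-style) loads.
struct Quad {
    unsigned char b[8];
} __attribute__((aligned(8)));

// Hypothetical 4x4 float matrix (64 bytes) aligned to a 32-byte boundary,
// the cache line size mentioned in the quoted message.
struct Matrix4 {
    float m[4][4];
} __attribute__((aligned(32)));

// Objects with static storage pick up the alignment of their type.
Quad g_quad;
Matrix4 g_matrix;

// True when p is a multiple of alignment (alignment must be a power of two).
inline bool is_aligned(const void* p, std::uintptr_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (reinterpret_cast<std::uintptr_t>(p) & (alignment - 1)) == 0;
}
```

A check such as is_aligned(&g_matrix, 32) at startup is a cheap way to
confirm that a given toolchain really delivers the alignment asked for.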

> Dynamic storage seems the only other possibility.
> The new operator aligns allocations to 4-byte boundaries.
> I could over-allocate by 4 bytes and then do some bit
> twiddling to force a pointer to an 8-byte boundary.
> What's the nicest way of doing this? Perhaps I should
> overload the matrix class' new operator to do the special
> allocations.
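
The over-allocate-and-round idea can be sketched roughly as follows.
This is one possible arrangement, not the only one: it allocates a
little extra, rounds the pointer up to the requested boundary, and
keeps the original pointer just below the aligned block so it can be
freed later. The names aligned_alloc_sketch, aligned_free_sketch and
Matrix are hypothetical, and the 32-byte figure is simply the cache
line size from the quoted message.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <new>

// Hypothetical helper: returns size bytes aligned to alignment
// (a power of two), or a null pointer if malloc fails.
inline void* aligned_alloc_sketch(std::size_t size, std::size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    // Room for the payload, worst-case padding, and the saved raw pointer.
    void* raw = std::malloc(size + alignment - 1 + sizeof(void*));
    if (raw == nullptr) return nullptr;
    // Leave space for the saved pointer, then round up to the boundary.
    std::uintptr_t start = reinterpret_cast<std::uintptr_t>(raw) + sizeof(void*);
    std::uintptr_t mask = static_cast<std::uintptr_t>(alignment) - 1;
    std::uintptr_t aligned = (start + mask) & ~mask;
    // Stash the pointer malloc returned in the slot just below the block.
    reinterpret_cast<void**>(aligned)[-1] = raw;
    return reinterpret_cast<void*>(aligned);
}

// Releases a block obtained from aligned_alloc_sketch.
inline void aligned_free_sketch(void* p) {
    if (p != nullptr) std::free(static_cast<void**>(p)[-1]);
}

// Hypothetical matrix class whose single-object new/delete use the helpers.
struct Matrix {
    float m[4][4];

    static void* operator new(std::size_t size) {
        void* p = aligned_alloc_sketch(size, 32);
        if (p == nullptr) throw std::bad_alloc();
        return p;
    }
    static void operator delete(void* p) noexcept { aligned_free_sketch(p); }
};
```

With this in place, `new Matrix` yields a 32-byte aligned object and
`delete` releases it through the matching helper. Arrays created with
`new Matrix[n]` are not covered, since only the single-object operators
are overloaded here; operator new[] and operator delete[] would need
the same treatment.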


Best regards,
Alexander Bokovoy, <bokovoy AT bspu DOT unibel DOT by>
= Linux ==============================================================
Though it is always possible to have a look at the world through the
Windows, people usually prefer not only to look but live in it too.
============================================================== Linux =

