Mail Archives: pgcc/1999/03/04/10:18:54

Message-ID: <19990304152121.42144@insula.local>
Date: Thu, 4 Mar 1999 15:21:21 +0000
From: Philipp Rumpf <prumpf AT jcsbs DOT lanobis DOT de>
To: pgcc AT delorie DOT com
Subject: Re: Intel/Cygnus
References: <36DD6D94 DOT 79AFEC8F AT mitre DOT org> <000f01be6632$02e96240$3bd16482 AT ellemtel DOT se>
Mime-Version: 1.0
X-Mailer: Mutt 0.89.1
In-Reply-To: <000f01be6632$02e96240$3bd16482@ellemtel.se>; from David Jonsson on Thu, Mar 04, 1999 at 12:27:39PM +0100
X-Accept-Language: en,de,se
Reply-To: pgcc AT delorie DOT com

> This is far from trivial. The C syntax needs to be abandoned if the optimization
> is to be transparent to the programmer; see SWAR, http://shay.ecn.purdue.edu/~swar/

I cannot see what is so difficult about it[1] ... I think it is just a special case of
loop unrolling.

char *p;
int i;

for(i=0; i<4; i++)
	p[i] |= 0x80;

should become a single 32-bit OR ... once we can do that, the rest of SIMD should be
trivial[2].
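
A minimal sketch of that transformation, written by hand rather than by any compiler. The function names (or_bytes, or_word, check_or_versions) are made up for illustration, and unsigned char is used instead of plain char so that setting the top bit is well defined everywhere. Because the mask is the same in every byte, byte order does not matter here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* The byte-wise loop, written out as a function. */
void or_bytes(unsigned char *p)
{
    int i;

    for (i = 0; i < 4; i++)
        p[i] |= 0x80;
}

/* What the optimizer would ideally produce: one 32-bit OR. */
void or_word(unsigned char *p)
{
    uint32_t w;

    memcpy(&w, p, sizeof w);      /* safe for any alignment of p */
    w |= UINT32_C(0x80808080);    /* same mask in every byte lane */
    memcpy(p, &w, sizeof w);
}

/* Both versions should agree on every input; spot-check one. */
void check_or_versions(void)
{
    unsigned char a[4] = { 0x00, 0x7f, 0x80, 0x41 };
    unsigned char b[4] = { 0x00, 0x7f, 0x80, 0x41 };

    or_bytes(a);
    or_word(b);
    assert(memcmp(a, b, sizeof a) == 0);
}
```

The memcpy calls are only there to keep the sketch portable; a compiler doing this itself would need to prove, or arrange, that a direct 32-bit access through p is allowed.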

> Another approach is to use a macro-like addition to ordinary compilers.
> This is what Apple has done with AltiVec, which is more promising than MMX
> or KNI/SSI, http://developer.apple.com/hardware/altivec/model.html

Intel is doing something very similar in their compilers; they even list the
compiler intrinsics (or whatever they call them) in the instruction set reference ...

The macro approach has an additional advantage, though: I really would not like to get
only 11 bits of precision for a normal float, although sometimes I probably would not mind.

[1] - I know next to nothing about gcc/egcs/pgcc internals, so I may have missed
something important.

[2] - Well, it could be a bit difficult to ensure a float * is 128-bit aligned ...
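
To make the alignment concern in [2] concrete, here is a rough sketch of the run-time check a compiler (or a programmer) could fall back on when it cannot prove alignment at compile time. The helper names is_aligned_128 and scalar_prologue are hypothetical, and the code assumes a 4-byte float and the usual flat mapping from pointers to uintptr_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Nonzero if p sits on a 16-byte (128-bit) boundary. */
int is_aligned_128(const float *p)
{
    return ((uintptr_t)p % 16) == 0;
}

/*
 * Number of leading elements (at most n) to process with scalar code
 * so that the remainder of the array starts on a 16-byte boundary.
 */
size_t scalar_prologue(const float *p, size_t n)
{
    size_t misalign = (size_t)((uintptr_t)p % 16);
    size_t k;

    /* Precondition: p is at least aligned for float itself. */
    assert(misalign % sizeof(float) == 0);

    k = misalign ? (16 - misalign) / sizeof(float) : 0;
    return k < n ? k : n;
}
```

A vectorized loop would then handle the first scalar_prologue(p, n) elements one at a time, switch to 128-bit operations for the aligned middle part, and finish any leftover elements with scalar code again.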
