Mail Archives: pgcc/1999/03/05/06:58:25
From: "David Jonsson" <David DOT Jonsson AT ellemtel DOT se>
To: <pgcc AT delorie DOT com>
Subject: SSI/KNI support (was RE: Intel/Cygnus)
Date: Fri, 5 Mar 1999 12:57:34 +0100
Message-ID: <000001be66ff$5c17a660$3bd16482@ellemtel.se>
MIME-Version: 1.0
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0
Importance: Normal
In-Reply-To: <19990304152121.42144@insula.local>
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.0810.800
X-MIME-Autoconverted: from quoted-printable to 8bit by delorie.com id GAA03055
Reply-To: pgcc AT delorie DOT com

> > ----------
> > From: Philipp Rumpf[SMTP:PRUMPF AT JCSBS DOT LANOBIS DOT DE]
> > Sent: Thursday, March 04, 1999 4:21:21 PM
> > To: pgcc AT delorie DOT com
> > Subject: Re: Intel/Cygnus
> > Auto forwarded by a Rule
> >
> > This is far from trivial. The C syntax needs to be abandoned if the optimization
> > is to be transparent to the programmer, see SWAR http://shay.ecn.purdue.edu/~swar/
>
> I cannot see what is so difficult about it[1] ... I think it is
> just a special case of loop unrolling.
>
> char *p;
> int i;
>
> for(i=0; i<4; i++)
> p[i] |= 0x80;
>
> should become a 32-bit OR ... once we can do that, the rest of
> SIMD should be trivial[2]
What you write is trivial if it is allowed. I am no compiler expert, but I don't think that a compiler is allowed to unroll that loop. It isn't obvious that p and i are independent, nor that p[0], p[1], etc. are.
> > Another approach is to use a macro-like addition to ordinary compilers.
> > This is what Apple has done with AltiVec, which is more promising than MMX
> > or KNI/SSI, http://developer.apple.com/hardware/altivec/model.html
>
> Intel is doing something very similar in their compilers, they even give the
> compiler intrinsics or whatever they call them in the instruction set reference ...
Like libmmx below?
> The macro approach has additional advantages though, I really would not like to get
> 11 bits precision for a normal float though I probably would not mind sometimes.
That is often enough, for example for sound processing or simple geometry.
> [2] - Well, it could be a bit difficult to ensure a float * is 128-bit aligned ...
Just align all memory on 128-bit boundaries when compiling. Or what about a new type like the one in Randy Fisher's libmmx, http://min.ecn.purdue.edu/~rfisher/Research/Libmmx/libmmx.html
typedef union {
	long long		q[2];	/* 2 Quadword (64-bit) values; C has no 128-bit "long long long long" */
	unsigned long long	uq[2];	/* 2 Unsigned Quadword */
	int			d[4];	/* 4 Doubleword (32-bit) values */
	unsigned int		ud[4];	/* 4 Unsigned Doubleword */
	short			w[8];	/* 8 Word (16-bit) values */
	unsigned short		uw[8];	/* 8 Unsigned Word */
	char			b[16];	/* 16 Byte (8-bit) values */
	unsigned char		ub[16];	/* 16 Unsigned Byte */
	float			s[4];	/* 4 Single-precision (32-bit) values */
} __attribute__ ((aligned (16))) ssi_t;	/* On a 16-byte (128-bit) boundary */
He also defines macros that make it possible to write things like paddd_m2r(variable, mm0)
I asked him in December whether he would support SSI, but he said MMX kept him fully occupied. I hope all the extra instruction sets (3DNow!, MMX, SSI) can live in the same .h file.
David