From: elf AT netcom DOT com (Marc Singer)
Message-Id: <199701081948.LAA29707@netcom6.netcom.com>
Subject: Re: Fixed Point (Optimization)
To: deef AT pobox DOT oleane DOT com (Francois Charton)
Date: Wed, 8 Jan 1997 11:48:48 -0800 (PST)
Cc: djgpp AT delorie DOT com (DJGPP List Alias)
In-Reply-To: <32D3D12F.78F6@pobox.oleane.com> from "Francois Charton" at Jan 8, 97 05:54:07 pm
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Content-Length: 2965      

> One important point, though, is that fixed point multiplies are not 
> directly comparable to floating point multiplies: it actually takes a 
> multiply and a shift to do a fixed point multiply... On my 486, fixed 
> point multiplies are on average a little slower than their floating 
> point equivalents.

My 32 bit version takes two shifts and an add.  I think that our gain
was in avoiding coercion.  On some CPUs, there is an instruction to
extract the necessary portions without shifting.  I think it is the
MIPS CPU that provides an exchange instruction that takes one cycle to
swap the high and low 16 bit values.
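
For illustration, here is a minimal sketch of one way to do a 16.16
multiply in 32 bits without a wider intermediate.  It is not
necessarily the routine described above: the names fx16_t and
fx16_mul_preshift are hypothetical, and the approach (pre-shifting
each operand by 8 bits) drops the low 8 fractional bits of each input
and only stays in range when both magnitudes are below 128.0.

```c
#include <stdint.h>

/* Hypothetical 16.16 fixed-point type: 16 integer bits, 16 fractional bits. */
typedef int32_t fx16_t;

/*
 * Multiply two 16.16 values using only 32-bit arithmetic.
 * Each operand is shifted right by 8 first, so the product of the two
 * shifted values already carries 16 fractional bits (8 + 8).
 * Assumptions: |a| and |b| are below 128.0 so the product fits in 32 bits,
 * and right-shifting a negative value is an arithmetic shift (true for gcc,
 * but implementation-defined in ISO C).
 */
fx16_t fx16_mul_preshift(fx16_t a, fx16_t b)
{
    return (a >> 8) * (b >> 8);
}
```

With 1.5 stored as 98304 and 2.25 stored as 147456, the function
returns 221184, which is 3.375 in 16.16.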

> 
> However, adds, subs, compares (!), multiplies by ints... were very fast 
> in fixed point... 
> 
> > 
> > The trouble with the fixed point math was that C does not permit us to
> > capture the 64 bit result of a 32 bit multiply.  This is a rather
> > serious problem in portable software. 
> 
> In DJGPP, you could always cast to (long long) (a 64 bit int)... but 
> this would slow the process...

Our compilers had this, too.  However, DJGPP and these others use
subroutine calls for the 64 bit operations.  This was unacceptable as
it negated any performance gain achieved through using fixed point math.
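
As a point of reference, the portable version under discussion looks
roughly like the sketch below.  The names fx_t, FX_FRAC_BITS and
fx_mul_wide are hypothetical, and a 16.16 format is assumed.  Whether
the 64-bit multiply and shift compile to inline instructions or to
library calls depends on the compiler and target.

```c
#include <stdint.h>

/* Hypothetical 16.16 fixed-point type and fractional bit count. */
typedef int32_t fx_t;
#define FX_FRAC_BITS 16

/*
 * Multiply two fixed-point values through a 64-bit intermediate so the
 * full 32x32 product is kept before the fractional bits are shifted out.
 * Assumption: right-shifting a negative 64-bit value is an arithmetic
 * shift (true for gcc, but implementation-defined in ISO C).
 */
fx_t fx_mul_wide(fx_t a, fx_t b)
{
    int64_t wide = (int64_t)a * (int64_t)b;
    return (fx_t)(wide >> FX_FRAC_BITS);
}
```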

> I usually use 22.10 bits fixed, instead of the traditional 16.16.  A
> 10 bit fractional part accounts for three decimal places, which is
> usually enough.  And 16 is only 4 and a half decimals, which is not
> that much more. The good thing though is the 24 bits is about 8
> million (signed), so you can safely multiply numbers up to a few
> thousand... and precisely divide numbers up to 8093.

I spent a day or so running numbers on the desired bit partition.  For
our application, we could have used an uneven partition, but it would
have complicated the math enough to be a performance loss on one of
the CPUs.  Also, I think that three decimal places was not quite
enough and we had to use eleven bits to guarantee sufficient accuracy.
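
To make the partition trade-off concrete, the sketch below computes
the step size and the largest whole number for a given count of
fractional bits.  The helper names are hypothetical, and it assumes a
32-bit signed word in which the sign bit is counted among the integer
bits.

```c
/*
 * Smallest representable step for a format with frac_bits fractional bits.
 * Valid for frac_bits in 1..30.
 */
double fx_resolution(int frac_bits)
{
    return 1.0 / (double)(1L << frac_bits);
}

/*
 * Largest whole number representable in a 32-bit signed word when
 * frac_bits bits are reserved for the fraction (frac_bits in 1..30).
 */
long fx_int_max(int frac_bits)
{
    return (1L << (31 - frac_bits)) - 1;
}
```

Under those assumptions, 10 fractional bits give a step of 1/1024
(about 0.00098) and 11 bits give 1/2048 (about 0.00049), while the
largest whole number falls from 2097151 to 1048575.  With 16
fractional bits the step is 1/65536 and the largest whole number is
32767.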

> > I had to code for three
> > different CPUs and found, interestingly enough, that the Microsoft
> > x86 compiler was the only one that could not correctly optimize the
> > inline assembler.  It made some inappropriate assumptions about
> > register use that prevented us from using the assembly language
> > fixed-point math routines.  DJGPP did it correctly. ;-)
> >
> 
> Come on, this has to be a joke... A Microsoft program buggy??? and a free 
> program without bugs???

I know.  It is hard to believe that a SUPPORTED product would have
bugs.  But, don't get me started.  I encountered two nasty MS compiler
bugs last year and had to do some ridiculous gymnastics to make my
programs run.

It's been a struggle to convince clients that freeware (unsupported?)
can be better than costplusware (supported?).  There seems to be mass
hypnosis convincing people that buggy programs hawked by large
corporations are better than stable programs available on the net.

Cheers.

-- Marc Singer