From: gpt20 AT thor DOT cam DOT ac DOT uk (G.P. Tootell)
Newsgroups: comp.os.msdos.djgpp
Subject: Re: floating point is... fast???
Date: 20 Jan 1997 11:03:39 GMT
Organization: University of Cambridge, England
Lines: 61
Sender: gpt20 AT hammer DOT thor DOT cam DOT ac DOT uk (G.P. Tootell)
Message-ID: <5bvjeb$mji@lyra.csx.cam.ac.uk>
References: <5brd2e$dap AT lyra DOT csx DOT cam DOT ac DOT uk> <32e22337 DOT 2066519 AT ursa DOT smsu DOT edu>
NNTP-Posting-Host: hammer.thor.cam.ac.uk
To: djgpp AT delorie DOT com
DJ-Gateway: from newsgroup comp.os.msdos.djgpp


well i dug the big book of cycles out today. this is what it says..

(all figures in clock cycles)

          fdiv    fmul    idiv    imul    div     mul
486(7)    8-89    11-27   43/44   42      40      42
pentium   39-42   1-7     22-46   10/11   17-41   11


can anyone confirm those values? just on the off chance there's a mistake in my
book. now it strikes me that rather than do the expensive operation

float a,b,c,d,x,y;

c=x/b;
d=y/b;

i should attempt to precalculate 1/b (especially if it's used more than once) and
do

a=1/b;
c=x*a;
d=y*a;

which would save a whole load of cycles, particularly on a pentium.
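
to make the two variants concrete, here is a minimal sketch in c. the function names scale_div and scale_recip are made up for illustration and are not from any library. it only shows the transformation; it says nothing about the actual cycle counts, which would need measuring on the target cpu.

```c
/* two divisions by the same b */
static void scale_div(float x, float y, float b, float *c, float *d)
{
    *c = x / b;
    *d = y / b;
}

/* one division to get 1/b, then two multiplications */
static void scale_recip(float x, float y, float b, float *c, float *d)
{
    float a = 1.0f / b;   /* reciprocal, computed once */
    *c = x * a;
    *d = y * a;
}
```

one thing to keep in mind: 1/b is rounded before it is multiplied, so x*a can differ from x/b in the last bit unless 1/b is exactly representable (e.g. b is a power of two).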
in fact, if i were doing the operations with signed longs instead...

signed long a,b,c,d,x,y;

i would be better off changing a to a float and writing

a=1.0/b;	/* because fdiv is still faster than idiv in most cases */
c=(float)x*a;
d=(float)y*a;

i.e. convert the integers to floating point wherever possible to make use of the
fmul timings, which beat every integer multiply and divide timing above even in
the worst case!


so there must be a catch somewhere of course ;)
perhaps converting float->int and back again takes a lot of time?
anyone?
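
one catch that doesn't depend on cycle counts at all is precision. the sketch below assumes float is an ieee single with a 24-bit mantissa and that long is at least 32 bits. div_int and div_via_float are hypothetical helper names, used here only to put the two approaches side by side.

```c
/* plain integer division, for reference */
static long div_int(long x, long b)
{
    return x / b;
}

/* divide via a float reciprocal and a float multiply */
static long div_via_float(long x, long b)
{
    float a = 1.0f / (float)b;   /* rounded reciprocal */
    float c = (float)x * a;      /* x is rounded to 24 bits of mantissa here */
    return (long)c;              /* conversion back truncates toward zero */
}
```

for small values the two agree (100 and 4 give 25 either way), but under the assumptions above div_via_float(16777217, 1) should come back as 16777216, because 2^24+1 cannot be represented in a float. so the float route is only a safe replacement when the integers are known to fit in the mantissa.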

nik


In article <32e22337 DOT 2066519 AT ursa DOT smsu DOT edu>, aho450s AT nic DOT smsu DOT edu (Tony O'Bryan) writes:
|> On 18 Jan 1997 20:50:22 GMT, gpt20 AT thor DOT cam DOT ac DOT uk (G.P. Tootell) wrote:
|> 
|> >while using the profiler on some code i had written i noticed that changing a
|> >floating point multiply to an unsigned multiply of 2 longs turned out to be
|> >slower. in fact floating point multiply appears to be faster than ordinary
|> >integer multiply for any case. is this actually true? and if so is there any
|> >reason i shouldn't just change every multiply in my code to make sure it's
|> >floating point?
|> 
|> My timing manual only goes up to the 486.  The FMUL takes from 11 to 16 clocks,
|> where the IMUL takes 13 to 42 clocks.

--