From: leathm AT solwarra DOT gbrmpa DOT gov DOT au (Leath Muller)
Message-Id: <199612261157.VAA25999@solwarra.gbrmpa.gov.au>
Subject: Re: Is DJGPP that efficient?
To: junk AT defeating DOT email DOT address
Date: Thu, 26 Dec 1996 21:57:01 +1000 (EST)
Cc: djgpp AT delorie DOT com
In-Reply-To: <aJQDsHAZs4uyEwDL@chocolat.foobar.co.uk> from "Paul Shirley" at Dec 21, 96 07:00:41 am

> The P5 has a 3 clk latency (the time it takes from issue to retiring an
> op), a throughput (the time before another op can be issued) of 1 clk
> *unless* you issue consecutive multiplies when it has a 2 clk
> throughput.

I thought that was the Pentium Pro, which could only perform an fmul every
other cycle, while the Pentium could keep going every cycle. I will check
this... :)

> AFAIK you can achieve a maximum multiply throughput of 2clks/mul.
> However in real code you have to actually load the next operand or sum
> the result which eats up that otherwise wasted cycle. The gcc fpu code
> is actually pretty good.

If you use fxch, you should be able to get a much better throughput than
gcc can provide...

Leathal.
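
To make the latency/throughput point concrete, here is a rough C sketch (not
taken from the thread; the function names dot_serial and dot_interleaved are
made up for illustration). With a multi-clock latency, a loop whose every
operation depends on the previous result cannot keep the FPU pipeline full.
Keeping several independent accumulators gives the scheduler something to
overlap. On the x87 stack those independent values have to be rotated to the
top of the stack, which is the job fxch does in hand-written code. Whether a
given compiler actually emits fxch for this source is an assumption, not
something C can guarantee.

```c
#include <assert.h>
#include <stddef.h>

/* Serial form: each add needs the previous sum, so the chain of
   dependent operations is as long as the array. */
double dot_serial(const double *a, const double *b, size_t n)
{
    double sum = 0.0;
    size_t i;

    assert(n == 0 || (a != NULL && b != NULL));
    for (i = 0; i < n; i++)
        sum += a[i] * b[i];
    return sum;
}

/* Interleaved form: three independent accumulators, so a multiply for
   one accumulator can be issued while the others are still in flight.
   Three is an illustrative choice, picked to match the 3 clk latency
   quoted above. */
double dot_interleaved(const double *a, const double *b, size_t n)
{
    double s0 = 0.0, s1 = 0.0, s2 = 0.0;
    size_t i = 0;

    assert(n == 0 || (a != NULL && b != NULL));
    for (; i + 3 <= n; i += 3) {
        s0 += a[i]     * b[i];
        s1 += a[i + 1] * b[i + 1];
        s2 += a[i + 2] * b[i + 2];
    }
    /* Handle the zero to two leftover elements. */
    for (; i < n; i++)
        s0 += a[i] * b[i];
    return s0 + s1 + s2;
}
```

Note that the interleaved version adds the products in a different order, so
for general floating-point data the result can differ from the serial one in
the last bits. Any speedup depends on the CPU and compiler and should be
measured rather than assumed.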