Mail Archives: djgpp/1996/07/12/18:45:10
In article <4s4sue$qbc AT status DOT gen DOT nz>,
Bruce Foley <brucef AT central DOT co DOT nz> wrote:
>Tom Wheeley <tomw AT tsys DOT demon DOT co DOT uk> wrote:
>
>
>>Although I've never used setpixel routines, I was always under the impression
>>that (x + (y << 8) + (y << 6)) is faster than (x + 320 * y).
>
>I think this is true of older processors, but on a 486, a
>well designed mul instruction is just as fast (or faster?),
>depending on the value of the operands.
>Don't know about the Pentium though, since simple
>instructions can be useful for keeping both Pipes going.
mul and imul are multicycle, unpairable instructions on the Pentium.
They execute in the FP unit. The only hard cycle count I have is
that "imul eax,217" takes 10 cycles. The number of cycles is
variable (I think).

According to the Intel documentation, the break-even point is 8 or
fewer bits set in the multiplier on the Pentium; on the 486 it's 6 or
fewer bits set. Since 320 = 256 + 64 has only two bits set, a multiply
by 320 is best implemented as shift and add on both processors.

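To make the bit-counting rule concrete, here is a rough sketch in C. The
function and constant names are made up for illustration, and the two
thresholds are simply the figures quoted above (not re-checked against
the Intel manuals), so treat it as a rule of thumb rather than a timing
model.

```c
#include <assert.h>

/* Break-even thresholds as quoted in this post (assumed, unverified). */
enum { BREAK_EVEN_486 = 6, BREAK_EVEN_PENTIUM = 8 };

/* Count the 1 bits in a constant multiplier k. */
static unsigned bits_set(unsigned k)
{
    unsigned n = 0;
    while (k != 0) {
        k &= k - 1;     /* clears the lowest set bit */
        n++;
    }
    return n;
}

/* Hypothetical helper: nonzero if shift-and-add is expected to be at
   least as good as a multiply for constant k, given a threshold. */
static int prefer_shift_add(unsigned k, unsigned break_even)
{
    return bits_set(k) <= break_even;
}

/* Self-check for the case discussed here: 320 = 256 + 64. */
static void check_320(void)
{
    assert(bits_set(320) == 2);
    assert(prefer_shift_add(320, BREAK_EVEN_486));
    assert(prefer_shift_add(320, BREAK_EVEN_PENTIUM));
}
```

Each set bit costs one shift and one add, which is why the count of set
bits is the thing to compare against the threshold.
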
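Of course, if you are using C, (x+320*y) should compile to the same
code as (x+(y<<8)+(y<<6)) on a properly designed compiler. Note the
parentheses: + binds tighter than << in C, so (x+y<<8+y<<6) would not
compute the same value.
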
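A minimal sketch of the two forms in C, for anyone who wants to check
this. The function names and the 320x200 screen size are assumptions for
illustration only. One thing to watch: in C, + binds tighter than <<, so
the shift form needs explicit parentheses to mean what is intended.

```c
#include <assert.h>

/* Assumed screen layout: 320 pixels per row, 200 rows. */
enum { SCREEN_WIDTH = 320, SCREEN_HEIGHT = 200 };

/* Offset using a multiply. */
static unsigned offset_mul(unsigned x, unsigned y)
{
    return x + 320 * y;
}

/* Offset using shift and add: 320*y = 256*y + 64*y. */
static unsigned offset_shift(unsigned x, unsigned y)
{
    return x + (y << 8) + (y << 6);
}

/* What "x + y << 8 + y << 6" means without parentheses, spelled out.
   Only meaningful for small y, since the shift count (8 + y) must stay
   below the width of unsigned. */
static unsigned offset_unparenthesized(unsigned x, unsigned y)
{
    return ((x + y) << (8 + y)) << 6;
}

/* Check that the two correct forms agree over the whole screen. */
static void check_offsets(void)
{
    unsigned x, y;
    for (y = 0; y < SCREEN_HEIGHT; y++)
        for (x = 0; x < SCREEN_WIDTH; x++)
            assert(offset_shift(x, y) == offset_mul(x, y));
}
```

Compiling both functions with something like "gcc -O2 -S" and comparing
the assembly is the easy way to see what a given compiler actually does
with the multiply.
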
Eric
--
Eric Korpela | An object at rest can never be
korpela AT ssl DOT berkeley DOT edu | stopped.