From: Phil Galbiati <Philip DOT S DOT Galbiati AT Tek DOT com>
Newsgroups: comp.os.msdos.djgpp
Subject: Re: asm opcode speeds
Date: Tue, 29 Oct 1996 17:03:32 -0800
Organization: Tektronix
Lines: 55
Message-ID: <3276A964.64A3@Tek.com>
References: <5523m3$ss5 AT the-fly DOT zip DOT com DOT au>
NNTP-Posting-Host: philipga.cse.tek.com
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
To: Dean <deanh AT zip DOT com DOT au>
To: djgpp AT delorie DOT com
DJ-Gateway: from newsgroup comp.os.msdos.djgpp

Dean wrote:
> 
> Does anyone know where I can get the speeds of different opcodes for
> assembly.  In particular, I'm wonder how fast my plot_pixel routine
> is:
> 
> void plot_pixel( int x, int y, unsigned char colour, short where ) {
>   __asm__ ("

<SNIP>

<DISCLAIMER> I don't grok assembly, </DISCLAIMER> but I think that
you need a little more context to evaluate the speed of your pixel
plotting function.

If you are running on a machine with a cache, the speed of your
function will depend on whether its code is already in the cache.
Similarly, if your program uses more virtual memory than you have
physical memory to hold it, the speed of your function will depend
on whether it has to be swapped in from disk.

Also, it appears that your function plots pixels one at a time,
so plotting two pixels requires two function calls.  If this is the
case, I would guess that (assuming the cache is warm) most of the
time spent in each call goes to function call overhead rather than
to plotting the pixel.

Conclusion: you are optimizing the wrong part of your program.

Amdahl's Law [poorly paraphrased] says that if you try to speed up a
program by optimizing only part of it, the most run time you can save
is the fraction of the total time spent in the part you are optimizing.

So if 20% of the execution time of your function is attributable to
the few assembly instructions needed to plot the pixel, then even if
you optimized those instructions down to zero time, the function would
still take 80% of its original (unoptimized) time.
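
To put numbers on that, here is a minimal sketch in C of the usual
form of Amdahl's Law.  The function names and the speedup factors are
made up for illustration; only the 20% figure comes from the example
above.

```c
#include <assert.h>

/* Fraction of the original run time that remains when a fraction `f`
   of the program is made `s` times faster (Amdahl's Law).
   `f` and `s` are illustrative parameter names. */
double remaining_time(double f, double s)
{
    assert(f >= 0.0 && f <= 1.0 && s > 0.0);
    return (1.0 - f) + f / s;
}

/* Overall speedup: original time divided by the new time. */
double amdahl_speedup(double f, double s)
{
    return 1.0 / remaining_time(f, s);
}
```

With f = 0.20, doubling the speed of the plotting instructions (s = 2)
still leaves about 90% of the original time, and no value of s gets the
total below 80%, which is an overall speedup of at most 1.25x.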

I *suspect* that your assembly instructions account for significantly
*less* than 20% of the execution time [but that's an uneducated guess],
so optimizing them will speed you up even less.  The way to speed up
your function is by eliminating some (or all) of the function calls,
either by in-lining it, or by plotting multiple pixels per call.
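
As a rough sketch of both options, assume a plain 320x200 buffer in
ordinary memory with one byte per pixel.  The buffer, its dimensions,
and the names put_pixel and plot_hline are my assumptions, not your
code, and getting the buffer onto the screen is left out entirely.

```c
#include <assert.h>
#include <string.h>

#define SCREEN_WIDTH  320   /* assumed mode: 320x200, 1 byte per pixel */
#define SCREEN_HEIGHT 200

/* Option 1: a static inline function, so an optimizing compiler may
   expand it at the call site instead of emitting a call. */
static inline void put_pixel(unsigned char *fb, int x, int y,
                             unsigned char colour)
{
    fb[y * SCREEN_WIDTH + x] = colour;
}

/* Option 2: plot a whole horizontal run of pixels (x0..x1 inclusive)
   in a single call, so the call overhead is paid once per run. */
void plot_hline(unsigned char *fb, int x0, int x1, int y,
                unsigned char colour)
{
    assert(x0 >= 0 && x0 <= x1 && x1 < SCREEN_WIDTH);
    assert(y >= 0 && y < SCREEN_HEIGHT);
    memset(fb + y * SCREEN_WIDTH + x0, colour, (size_t)(x1 - x0 + 1));
}
```

The inline version only gives the compiler the chance to drop the call
when optimization is turned on; the run version pays for one call per
run of pixels instead of one per pixel.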

For more on Amdahl's Law, get hold of a good computer architecture
book, like _Computer_Architecture:_A_Quantitative_Approach_ by
Hennessy & Patterson.

Hope this helps.
--Phil Galbiati
=================================================
     Any opinions expressed here reflect the
     ignorance of the author, NOT Tektronix.
=================================================