Mail Archives: djgpp/1997/11/29/15:02:35
Tom Robinson wrote:
> Quick note:
>
> _farpokeb(_dos_ds, 0xA0000+(y<<8)+(y<<6)+x, c);
>
> is a faster way of doing it...
i wouldn't try to outsmart the compiler's optimization on such a mundane
operation as multiplication. did you ever compile the two versions with
optimizations turned on and look at the assembler output? i recommend
doing it.
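as a quick sanity check before reading any assembler, here is a minimal sketch (not part of the original test; the names offset_mul, offset_shift and count_mismatches are made up for illustration) showing that the two index expressions give the same offset for every pixel of a 320x200 screen. since they are equivalent, the compiler is free to pick whichever instruction sequence it prefers for the multiply:

```c
#include <assert.h>

#define XRES 320
#define YRES 200

/* offset of pixel (x, y) in a 320-wide linear buffer, plain multiply */
unsigned long offset_mul(int x, int y)
{
    assert(x >= 0 && x < XRES && y >= 0 && y < YRES);
    return (unsigned long)XRES * y + x;
}

/* the same offset written with shifts: 320*y == 256*y + 64*y */
unsigned long offset_shift(int x, int y)
{
    assert(x >= 0 && x < XRES && y >= 0 && y < YRES);
    return ((unsigned long)y << 8) + ((unsigned long)y << 6) + x;
}

/* number of on-screen coordinates where the two forms disagree */
long count_mismatches(void)
{
    long bad = 0;
    int x, y;
    for (y = 0; y < YRES; y++)
        for (x = 0; x < XRES; x++)
            if (offset_mul(x, y) != offset_shift(x, y))
                bad++;
    return bad;
}
```

count_mismatches() returns 0 here. to compare what gcc actually generates for the two forms, its -S switch writes the assembler output to a .s file.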
>
> >That requires an odd header file, it might be <sys\farptr.h>, but you
> >should look it up in LIBC.INF as well. That's the second fastest way
> > I know how to. (The fastest way I know how, requires disabling
> > protected memory, and is more complicated to get started)
please mention the drawbacks of using near pointers, too. remember, you
are supposedly helping a newbie. let people learn the proper (in a lot
of environments), more portable way of doing things first.
take a look at the test program below. the results when i run it under
windows 95 on dx-4/75 with 16 mb (with recompilations in between tests
to get rid of the contents of the cpu cache) are:
                       | multiply   shift
-----------------------------------------
with no optimization:  |   127       129
with -O                |    46        47
with -O2               |    56        57
with -O3               |    19        22
- times are measured using time()
- 1000x320x200 putpixel operations
- ATI Mach 64 CX with 2Mb DRAM
now, i also tried switching around the calls to the two versions. the
results were unchanged.
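since time() only counts whole seconds, a finer-grained reading can help when the differences are small. a rough sketch using the standard clock() function, which reports processor time rather than wall-clock time. the helper names (elapsed_seconds, time_runs, no_work) are hypothetical and not part of the test program:

```c
#include <assert.h>
#include <time.h>

/* convert a pair of clock() readings to seconds */
double elapsed_seconds(clock_t start, clock_t end)
{
    return (double)(end - start) / CLOCKS_PER_SEC;
}

/* placeholder workload; substitute the loop you want to measure */
void no_work(void)
{
}

/* run work() reps times and return the processor time used, in seconds */
double time_runs(void (*work)(void), int reps)
{
    clock_t start;
    int i;
    assert(work != NULL && reps >= 0);
    start = clock();
    for (i = 0; i < reps; i++)
        work();
    return elapsed_seconds(start, clock());
}
```

the resolution of clock() depends on the C library, so this is still a coarse tool; it only avoids rounding everything to whole seconds.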
i do not claim this to be the perfect test or anything. i am just
unable to see an overwhelming advantage to using shifts as opposed to a
straightforward multiply which also makes sense to a newbie who just
knows that the screen resolution is 320x200.
quite clearly, fruitful optimizations in putpixel routines lie in
judicious use of the _farns functions, converting 8-bit writes to 32-bit
writes where appropriate, and above all, using movedata for 'blits'. for
example, changing f1 and f2 in the following test routine to their
_farns equivalents yielded speed increases of 25% to 30% in the no
optimization and O3 cases. again, i am not claiming that the numbers are
extremely accurate or anything, but they give information on relative
magnitudes. my sole point is that providing this "great" optimization to
a newbie is counterproductive. the only real optimization method i know
of is think, measure, think, measure ...
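to illustrate the 8-bit versus 32-bit write idea, here is a portable sketch that works on an ordinary memory buffer rather than on video memory. the helpers fill_bytes and fill_words are hypothetical names, and the buffer length is assumed to be a multiple of 4. under djgpp the corresponding stores would presumably be _farnspokeb and _farnspokel after a single _farsetsel(_dos_ds), but that part is not shown or measured here:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* fill n pixels with colour c, one 8-bit store per pixel */
void fill_bytes(unsigned char *dst, size_t n, unsigned char c)
{
    size_t i;
    for (i = 0; i < n; i++)
        dst[i] = c;
}

/* same result, but four pixels per 32-bit store */
void fill_words(unsigned char *dst, size_t n, unsigned char c)
{
    /* replicate the colour into all four bytes of a 32-bit word */
    uint32_t w = (uint32_t)c * 0x01010101u;
    size_t i;
    assert(n % 4 == 0);                 /* assumption: no ragged tail */
    for (i = 0; i < n; i += 4)
        memcpy(dst + i, &w, sizeof w);  /* one 32-bit store */
}
```

both functions leave the buffer in the same state; the second simply issues a quarter as many stores. whether that is faster on a given machine is something to measure, not assume.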
finally, moving the calculation of the index to the buffer to the outer
loop rather than leaving it in the function call caused a 25% speed-up
in no optimization, and 5% speed-up in the O3 version.
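the change described above can be sketched as follows. this is an illustration on a plain memory buffer, not the code that produced the numbers; put, draw_per_pixel and draw_hoisted are hypothetical names, and put stands in for the far-pointer store:

```c
#include <assert.h>
#include <stddef.h>

#define XRES 320
#define YRES 200

/* stand-in for the far-pointer store; writes into an ordinary buffer */
static void put(unsigned char *screen, size_t offset, unsigned char c)
{
    assert(offset < (size_t)XRES * YRES);
    screen[offset] = c;
}

/* index recomputed for every pixel, as when it lives in the putpixel call */
void draw_per_pixel(unsigned char *screen, const unsigned char *src)
{
    int x, y;
    for (y = 0; y < YRES; y++)
        for (x = 0; x < XRES; x++)
            put(screen, (size_t)XRES * y + x, src[(size_t)XRES * y + x]);
}

/* row offset computed once per row in the outer loop */
void draw_hoisted(unsigned char *screen, const unsigned char *src)
{
    int x, y;
    for (y = 0; y < YRES; y++) {
        size_t row = (size_t)XRES * y;
        for (x = 0; x < XRES; x++)
            put(screen, row + x, src[row + x]);
    }
}
```

both loops write identical pixels. an optimizing compiler may well do this hoisting on its own, which would be consistent with the gain being much smaller at -O3.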
here is the test (in my case it was compiled with
gcc xp.c -o xp.exe -Wall -DITERATIONS=1000
and the appropriate optimization switch)
/* the following code is for informational purposes
* you can do whatever you want with it, so long as
* you understand that there are no explicit or
* implicit warranties. if you fry your computer
* while running it, you are on your own.
*/
#include <sys/farptr.h>
#include <go32.h>
#include <dpmi.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#define XRES 320
#define YRES 200
#define VGA13 0x13
#define CO80 0x03
void f1(int x, int y, unsigned char c)
{
    _farpokeb(_dos_ds, 0xa0000 + 320*y + x, c);
    return;
}
void f2(int x, int y, unsigned char c)
{
    _farpokeb(_dos_ds, 0xa0000 + (y<<8)+(y<<6)+x, c);
    return;
}
int set_gr_mode(int mode)
{
    __dpmi_regs r;
    memset(&r, 0, sizeof(r));
    r.x.ax = mode;
    return ( __dpmi_int(0x10, &r) ? -1 : mode );
}
unsigned char s[YRES*XRES];
int main(void)
{
    int i, x, y;
    time_t t1, t2;
    set_gr_mode(VGA13);
    /* fill the source buffer with random colours; each row is XRES wide */
    for(y=0; y<YRES; y++)
        for(x=0; x<XRES; x++)
            s[y*XRES + x] = (unsigned char)(256.0*(rand()/(RAND_MAX+1.0)));
    /* time the multiply version */
    time(&t1);
    for(i=0; i<ITERATIONS; i++)
        for(y=0; y<YRES; y++)
            for(x=0; x<XRES; x++)
                f1(x, y, s[y*XRES+x]);
    t1 = time(NULL) - t1;
    /* time the shift version */
    time(&t2);
    for(i=0; i<ITERATIONS; i++)
        for(y=0; y<YRES; y++)
            for(x=0; x<XRES; x++)
                f2(x, y, s[y*XRES+x]);
    t2 = time(NULL) - t2;
    set_gr_mode(CO80);
    /* time_t need not be int, so cast before printing */
    printf("f1: %ld\nf2: %ld\n", (long)t1, (long)t2);
    return 0;
}
--
----------------------------------------------------------------------
A. Sinan Unur
Department of Policy Analysis and Management, College of Human Ecology,
Cornell University, Ithaca, NY 14853, USA
mailto:sinan DOT unur AT cornell DOT edu
http://www.people.cornell.edu/pages/asu1/