Mail Archives: djgpp/1999/06/17/20:50:18
On Wed, 16 Jun 1999 16:54:51 -0700, Nate Eldredge <nate AT cartsys DOT com>
wrote:
>lodsb acts relative to ds, not es. You should load your segment into
>%ds.
>
>A few other problems:
>
>1. You need to restore es and ds after changing them. GCC expects them
>not to change. Adding them to the clobber list will NOT work, so either
>push/pop or use another register.
>
>2. You don't tell the compiler that %si or %di are being clobbered.
>
Hey, now it works until I turn on optimization :)
>And some efficiency issues, if you're interested:
>
>3. It's pointless to zero a register before overwriting it.
>
I know, but I was just adding that while trying to locate the error,
to make sure nothing was left in the high part.
>4. This code will run faster if you use the 32-bit registers and
>instructions. In protected mode, there is a 1-cycle penalty on each
>16-bit instruction.
>
But then I'll have to use a long int instead of short? Will it still
be faster?
>5. It's probably simpler to use an indirect move instead of stosb/lodsb.
>
And how do I do that? :)
>6. The multiply can be optimized better; this is left as an exercise for
>the reader.
>
mov ax,y
mov bx,ax
shl ax,8
shl bx,6
add ax,bx
ax==y*320?
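
A quick way to check the shift-and-add identity is a sketch in plain C. Since 320 = 256 + 64, y*320 should equal (y<<8) + (y<<6), and for y below 200 the result still fits in 16 bits (199*320 = 63680). The helper names mul320 and check_mul320 are made up for illustration and are not part of the original code.

```c
#include <assert.h>
#include <stdint.h>

/* y * 320 written as two shifts and an add: 320 = 256 + 64. */
static uint16_t mul320(uint16_t y)
{
    return (uint16_t)((y << 8) + (y << 6));
}

/* Compares every row of a 200-line screen against a plain multiply;
   returns the number of rows checked. */
static int check_mul320(void)
{
    int y;
    for (y = 0; y < 200; y++)
        assert(mul320((uint16_t)y) == (uint16_t)(y * 320));
    return y;
}
```

Calling check_mul320() from a small test program should run through all 200 rows without tripping the assert.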
>In fact, you could let the compiler do all this:
>
>#include <sys/farptr.h>
>#define mygetpixel(seg, add, x, y) \
>    (_farpeekb((seg), (add) + (x) + ((y) * 320)))
>#define myputpixel(seg, add, x, y, c) \
>    (_farpokeb((seg), (add) + (x) + ((y) * 320), (c)))
>
>The quality of the code, if optimization is on (you can find it by using
>-S), may surprise you.
>
Yes, but I don't want to replace the pixel functions. I just want working
asm code for them so I can write my blur function in asm; speeding it
up is the last step. I think even I could manage to speed it up if I
could just get it working.. :)
Anyway, now the code works, even with -O3, unless I use both functions
at the same time; then it only works without optimization. I test with
this loop:
for(x=0;x<320;x++)
  for(y=0;y<200;y++){
    myputpixel(screenseg,screenadd,x,y,5);
//  _putpixel(screen,x,y,5);
//  if(_getpixel(screen,x,y)!=5){
    if(mygetpixel(screenseg,screenadd,x,y)!=5){
      textprintf(screen,font,20,20,55,"x=%d y=%d",x,y);
      while (!keypressed()) {}
      exit(0);
    }
  }
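
For comparison, the same write-then-read loop can be run against a plain array instead of video memory, which gives a known-good reference to check the asm versions against. This is only a sketch under assumptions: a linear 320x200 layout with one byte per pixel, and made-up names (fb, fb_putpixel, fb_getpixel, roundtrip_test) that are not in the original code or in Allegro.

```c
#include <stdint.h>

#define SCREEN_W 320
#define SCREEN_H 200

/* Stand-in for video memory (assumption: linear, 1 byte per pixel). */
static uint8_t fb[SCREEN_W * SCREEN_H];

static void fb_putpixel(uint32_t add, int x, int y, uint8_t c)
{
    fb[add + (uint32_t)x + (uint32_t)y * SCREEN_W] = c;
}

static uint8_t fb_getpixel(uint32_t add, int x, int y)
{
    return fb[add + (uint32_t)x + (uint32_t)y * SCREEN_W];
}

/* Same loop order as the test above. Returns -1 if every pixel reads
   back correctly, otherwise y * 320 + x of the first mismatch. */
static long roundtrip_test(uint8_t colour)
{
    int x, y;
    for (x = 0; x < SCREEN_W; x++)
        for (y = 0; y < SCREEN_H; y++) {
            fb_putpixel(0, x, y, colour);
            if (fb_getpixel(0, x, y) != colour)
                return (long)y * SCREEN_W + x;
        }
    return -1;
}
```

If roundtrip_test(5) returns -1 here while the asm pair fails on the real screen, the address arithmetic itself is probably fine and the problem lies in how the asm interacts with the compiler.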
Here are the code pieces again, and I'd be really grateful if someone
could point out the problem :)
screenseg=screen->seg;
screenadd=bmp_write_line(screen,0);
unsigned char mygetpixel(unsigned short seg,unsigned long add,
                         unsigned short x,unsigned short y)
{
  unsigned char c;
  asm("push %%ds\n\t"
      "movw %1,%%ax\n\t"
      "movw %%ax,%%ds\n\t"
      "movw %2,%%ax\n\t"
      "xor %%bx,%%bx\n\t"
      "movw $0x140,%%bx\n\t"
      "mul %%bx\n\t"
      "addl %3,%%ax\n\t"
      "addw %4,%%ax\n\t"
      "movl %%ax,%%si\n\t"
      "lodsb\n\t"
      "movb %%al,%0\n\t"
      "pop %%ds"
      :"g="(c)
      :"g"(seg),"g"(y),"g"(add),"g"(x)
      :"ax","bx","si","memory"
  );
  return c;
}
void myputpixel(unsigned short seg,unsigned long add,
                unsigned short x,unsigned short y,unsigned char c)
{
  asm("push %%es\n\t"
      "movw %0,%%ax\n\t"
      "movw %%ax,%%es\n\t"
      "movw %1,%%ax\n\t"
      "xor %%bx,%%bx\n\t"
      "movw $0x140,%%bx\n\t"
      "mul %%bx\n\t"
      "addl %2,%%ax\n\t"
      "addw %3,%%ax\n\t"
      "movl %%ax,%%di\n\t"
      "movb %4,%%al\n\t"
      "stosb\n\t"
      "pop %%es"
      :
      :"g"(seg),"g"(y),"g"(add),"g"(x),"g"(c)
      :"ax","bx","di","memory"
  );
}
/Gathers