Mail Archives: djgpp/1999/06/17/20:50:18

From: gathers AT cyberdude DOT com (Gathers)
Newsgroups: comp.os.msdos.djgpp
Subject: Re: Why dosn't my asm getpixel() work?
Organization: Green Dragon
Message-ID: <376981c9.8709864@nntpserver.swip.net>
References: <3766f436 DOT 5083819 AT nntpserver DOT swip DOT net> <3768394B DOT 4C9E319E AT cartsys DOT com>
X-Newsreader: Forte Agent 1.0/32.354
MIME-Version: 1.0
Lines: 132
Date: Thu, 17 Jun 1999 23:49:57 GMT
NNTP-Posting-Host: 130.244.97.29
X-Complaints-To: news-abuse AT swip DOT net
X-Trace: nntpserver.swip.net 929663696 130.244.97.29 (Fri, 18 Jun 1999 01:54:56 MET DST)
NNTP-Posting-Date: Fri, 18 Jun 1999 01:54:56 MET DST
To: djgpp AT delorie DOT com
DJ-Gateway: from newsgroup comp.os.msdos.djgpp
Reply-To: djgpp AT delorie DOT com

On Wed, 16 Jun 1999 16:54:51 -0700, Nate Eldredge <nate AT cartsys DOT com>
wrote:

>lodsb acts relative to ds, not es.  You should load your segment into
>%ds.
>
>A few other problems:
>
>1. You need to restore es and ds after changing them.  GCC expects them
>not to change.  Adding them to the clobber list will NOT work, so either
>push/pop or use another register.
>
>2. You don't tell the compiler that %si or %di are being clobbered.
>
Hey, now it works until I turn on optimization :)


>And some efficiency issues, if you're interested:
>
>3. It's pointless to zero a register before overwriting it.
>
I know, but I was just adding that while trying to locate the error,
to make sure nothing was left in the high part.


>4. This code will run faster if you use the 32-bit registers and
>instructions.  In protected mode, there is a 1-cycle penalty on each
>16-bit instruction.
>
But then I'll have to use a long int instead of short? Will it still
be faster?


>5. It's probably simpler to use an indirect move instead of stosb/lodsb.
>
And how do I do that? :)


>6. The multiply can be optimized better; this is left as an exercise for
>the reader.
>
mov ax,y
mov bx,ax
shl ax,8
shl bx,6
add ax,bx
So now ax == y*320?
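
The shift-and-add idea can be checked in plain C without any assembly,
since 320 = 256 + 64. This is only a sketch: mul320 and check_mul320 are
made-up names, and the arithmetic is done in unsigned int rather than in
the 16-bit ax/bx registers used above.

```c
#include <assert.h>

/* Hypothetical helper: y*320 computed as (y<<8) + (y<<6). */
static unsigned int mul320(unsigned int y)
{
    return (y << 8) + (y << 6);
}

/* Compare against a plain multiply for every row of a 320x200 screen.
   Returns the number of rows checked. */
static int check_mul320(void)
{
    unsigned int y;
    int checked = 0;

    for (y = 0; y < 200; y++) {
        assert(mul320(y) == y * 320);
        checked++;
    }
    return checked;
}
```

For y below 200 the largest result is 199*320 = 63680, which still fits
in a 16-bit register.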


>In fact, you could let the compiler do all this:
>
>#include <sys/farptr.h>
>#define mygetpixel(seg, add, x, y) (_farpeekb((seg), (add) + (x) + ((y) * 320)))
>#define myputpixel(seg, add, x, y, c) (_farpokeb((seg), (add) + (x) + ((y) * 320), (c)))
>
>The quality of the code, if optimization is on (you can find it by using
>-S), may surprise you.
>
Yes, but I don't want to replace the pixel functions. I just want the
asm code for them so I can rewrite my blur function in asm; speeding it
up is the last step. I think even I could manage to speed it up if I
could just get it working. :)

Anyway, now the code works, even with -O3, unless I use both functions
at the same time. Then it only works without optimization. I test with
this loop:
for(x=0;x<320;x++)
   for(y=0;y<200;y++){
      myputpixel(screenseg,screenadd,x,y,5);
//      _putpixel(screen,x,y,5);
//      if(_getpixel(screen,x,y)!=5){
      if(mygetpixel(screenseg,screenadd,x,y)!=5){
         textprintf(screen,font,20,20,55,"x=%d y=%d",x,y);
         while (!keypressed()) {}
         exit(0);
      }
   }
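
For comparison, the same round-trip test can be run against an ordinary
memory buffer, which takes the segment handling out of the picture and
checks only the x + y*320 addressing. A minimal sketch, assuming a flat
320x200 byte buffer; fake_screen, buf_putpixel, buf_getpixel and
roundtrip_failures are made-up names, not Allegro or DJGPP functions.

```c
#include <assert.h>
#include <string.h>

#define SCREEN_W 320
#define SCREEN_H 200

/* Stand-in for video memory; the real code goes through a selector. */
static unsigned char fake_screen[SCREEN_W * SCREEN_H];

static void buf_putpixel(unsigned char *base, unsigned x, unsigned y,
                         unsigned char c)
{
    assert(x < SCREEN_W && y < SCREEN_H);
    base[y * SCREEN_W + x] = c;
}

static unsigned char buf_getpixel(const unsigned char *base, unsigned x,
                                  unsigned y)
{
    assert(x < SCREEN_W && y < SCREEN_H);
    return base[y * SCREEN_W + x];
}

/* Write then read back every pixel, in the same order as the loop above.
   Returns how many pixels did not read back correctly (0 means success). */
static long roundtrip_failures(unsigned char colour)
{
    long bad = 0;
    unsigned x, y;

    memset(fake_screen, 0, sizeof fake_screen);
    for (x = 0; x < SCREEN_W; x++)
        for (y = 0; y < SCREEN_H; y++) {
            buf_putpixel(fake_screen, x, y, colour);
            if (buf_getpixel(fake_screen, x, y) != colour)
                bad++;
        }
    return bad;
}
```

If this version passes and the asm version does not, the addressing
formula itself is probably not the problem.
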
Here are the code pieces again, and I'd be really grateful if someone
could point out the problem :)

screenseg=screen->seg;
screenadd=bmp_write_line(screen,0);

unsigned char mygetpixel(unsigned short seg,unsigned long add,
                         unsigned short x,unsigned short y)
{
unsigned char c;
asm("push %%ds\n\t"
    "movw %1,%%ax\n\t"
    "movw %%ax,%%ds\n\t"
    "movw %2,%%ax\n\t"
    "xor %%bx,%%bx\n\t"
    "movw $0x140,%%bx\n\t"
    "mul %%bx\n\t"
    "addl %3,%%ax\n\t"
    "addw %4,%%ax\n\t"
    "movl %%ax,%%si\n\t"
    "lodsb\n\t"
    "movb %%al,%0\n\t"
    "pop %%ds"
    :"g="(c)
    :"g"(seg),"g"(y),"g"(add),"g"(x)
    :"ax","bx","si","memory"
);
return c;
}

void myputpixel(unsigned short seg,unsigned long add,
                unsigned short x,unsigned short y,unsigned char c)
{
asm("push %%es\n\t"
    "movw %0,%%ax\n\t"
    "movw %%ax,%%es\n\t"
    "movw %1,%%ax\n\t"
    "xor %%bx,%%bx\n\t"
    "movw $0x140,%%bx\n\t"
    "mul %%bx\n\t"
    "addl %2,%%ax\n\t"
    "addw %3,%%ax\n\t"
    "movl %%ax,%%di\n\t"
    "movb %4,%%al\n\t"
    "stosb\n\t"
    "pop %%es"
    :
    :"g"(seg),"g"(y),"g"(add),"g"(x),"g"(c)
    :"ax","bx","di","memory"
);
}
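
One thing that may be worth checking in the two listings above: on x86 a
16-bit mul writes its 32-bit product to the dx:ax pair, so dx is
overwritten even though only ax, bx and si/di are in the clobber lists.
The sketch below models that in plain C. Whether this is what actually
breaks under optimization is an assumption, not something confirmed
here, and mul16 and mul16_result are made-up names.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the two registers written by a 16-bit mul. */
struct mul16_result {
    uint16_t ax;  /* low 16 bits of the product */
    uint16_t dx;  /* high 16 bits of the product */
};

/* Models "mul operand" with a 16-bit operand: dx:ax = ax * operand. */
static struct mul16_result mul16(uint16_t ax, uint16_t operand)
{
    struct mul16_result r;
    uint32_t product = (uint32_t)ax * operand;

    r.ax = (uint16_t)(product & 0xFFFFu);
    r.dx = (uint16_t)(product >> 16);
    /* The two halves always recombine to the full product. */
    assert((((uint32_t)r.dx << 16) | r.ax) == product);
    return r;
}
```

For y below 200 the high half of y*0x140 is always 0, but dx is still
overwritten with that 0, so whatever the compiler kept there is lost.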

/Gathers
