Mail Archives: djgpp/1997/08/12/06:34:42

From: beppu AT rigel DOT oac DOT uci DOT edu (John Beppu)
Newsgroups: comp.os.msdos.djgpp
Subject: Re: Optimization and bug smashing.. a lot of other questions too :)
Date: 11 Aug 1997 20:50:38 GMT
Organization: University of California, Irvine
Lines: 103
Message-ID: <5sntuu$99n@news.service.uci.edu>
References: <33ee3f7f DOT 4973504 AT news DOT inlink DOT com>
NNTP-Posting-Host: rigel.oac.uci.edu
To: djgpp AT delorie DOT com
DJ-Gateway: from newsgroup comp.os.msdos.djgpp

In article <33ee3f7f DOT 4973504 AT news DOT inlink DOT com>,
[vecna] <vecna AT inlink DOT com> wrote:

>Okay, this is the single most important routine to optimize. It's the
>transparent blitter. It's very important to optimize already, and it
>will get VERY important to optimize in the next version... EVERY CYCLE
>COUNTS in this one. 

    Recently, I reverse engineered the transparent sprite format used
    in the Win95 port of the game Samurai Spirits 2 (the classic 
    Neo*Geo game).  They used an interesting technique which did not
    involve checking every byte for the "transparent value" like 
    you are doing.  

    The sprites were restructured so that the drawing routine can
    skip transparent pixels outright instead of performing a costly
    comparison on each one.  A sprite is made up of variable-sized
    structures I call "Scan Lines".  In turn, each scan line is made
    up of 0 or more variable-sized structures I call "line segments".
    The following describes the format of these two structures.

    ; Line Segment

    byte 0      horizontal offset (# of pixels to /not/ draw)
    byte 1      # of pixels to draw
    byte 2+     pixel data

    ; Scan Line

    word 0      # of bytes occupied by all data for this
                scan line (all line segments + the 2 bytes
                occupied by this one word header).
    byte 2+     Line Segment data.

    The hierarchy of their sprite format is:

    Sprite
      Scan Line
        Line Segment

    Hopefully, my description was not too vague.
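
    As a rough illustration of the layout above, here is a minimal
    sketch of a drawing routine in plain C.  Several things in it are
    assumptions rather than facts about the game's format: the one
    word header is read as 16-bit little-endian, each segment's
    offset is taken relative to the end of the previous segment, and
    the number of scan lines is passed in separately.  The names
    draw_rle_sprite, read_le16 and pitch are made up for this
    example.  It does no clipping and no validation of the data.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read the scan line header. Little-endian byte order is an assumption. */
static unsigned read_le16(const uint8_t *p)
{
    return (unsigned)p[0] | ((unsigned)p[1] << 8);
}

/*
 * Draw `height` scan lines of encoded sprite data starting at `dst`.
 * `pitch` is the width of the destination buffer in bytes.
 * Hypothetical helper: no clipping, no checks on malformed input.
 */
void draw_rle_sprite(uint8_t *dst, size_t pitch,
                     const uint8_t *data, int height)
{
    for (int y = 0; y < height; y++) {
        unsigned line_bytes = read_le16(data);   /* includes the 2 header bytes */
        const uint8_t *p    = data + 2;          /* first line segment */
        const uint8_t *end  = data + line_bytes; /* start of next scan line */
        uint8_t *out = dst;

        while (p < end) {
            unsigned skip  = p[0];               /* pixels to leave untouched */
            unsigned count = p[1];               /* pixels to copy */
            out += skip;
            memcpy(out, p + 2, count);           /* no per-pixel test needed */
            out += count;
            p += 2 + count;
        }

        data = end;
        dst += pitch;
    }
}
```

    The inner loop never compares a pixel against the transparent
    value; all of the branching happens once per segment rather than
    once per pixel.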

    This might be something you want to consider.  You'll have to
    write a routine that takes a rectangular sprite and makes a
    transparent sprite out of it.  Then you can make your highly
    optimized:
    
    void tcopysprite(tSprite* t);
                                                            let's try to
                                                            minimize the
                                                          number of push
                                                        instructions gcc
                                                              generates.  
    
                                                                     ^_^
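
    The conversion routine mentioned above could look something like
    the following sketch.  It assumes 0 is the transparent value (as
    in the quoted blitter), a little-endian header word, and offsets
    relative to the end of the previous segment.  Because the offset
    and count fields are single bytes, gaps and runs longer than 255
    pixels are split across several segments; that splitting rule is
    my own choice, not something taken from the game.  The name
    encode_rle_sprite and its parameters are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Encode a width*height rectangular sprite (0 = transparent) into `out`.
 * Returns the number of bytes written, or 0 if `out_cap` is too small
 * or a scan line does not fit in the 16-bit header.
 */
size_t encode_rle_sprite(const uint8_t *src, int width, int height,
                         uint8_t *out, size_t out_cap)
{
    size_t pos = 0;

    for (int y = 0; y < height; y++) {
        const uint8_t *row = src + (size_t)y * (size_t)width;
        size_t line_start = pos;
        int x = 0;

        if (pos + 2 > out_cap) return 0;
        pos += 2;                               /* reserve the header word */

        while (x < width) {
            int start = x;
            while (start < width && row[start] == 0) start++;
            if (start == width) break;          /* only transparency left */

            int gap = start - x;
            while (gap > 255) {                 /* offset field is one byte */
                if (pos + 2 > out_cap) return 0;
                out[pos++] = 255;
                out[pos++] = 0;                 /* empty segment: skip only */
                gap -= 255;
            }

            int count = 0;
            while (start + count < width && row[start + count] != 0
                   && count < 255)
                count++;

            if (pos + 2 + (size_t)count > out_cap) return 0;
            out[pos++] = (uint8_t)gap;
            out[pos++] = (uint8_t)count;
            memcpy(out + pos, row + start, (size_t)count);
            pos += (size_t)count;
            x = start + count;
        }

        size_t line_bytes = pos - line_start;   /* header + all segments */
        if (line_bytes > 0xFFFF) return 0;
        out[line_start]     = (uint8_t)(line_bytes & 0xFF);
        out[line_start + 1] = (uint8_t)(line_bytes >> 8);
    }
    return pos;
}
```

    A fully transparent row encodes to just the 2 byte header, and
    trailing transparent pixels on a row cost nothing at all.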

    You might want to plan for the future a little by not hard
    coding constants (such as the virtual buffer width, 352),
    because who knows--some day you may want to use a larger
    resolution.  It would be a shame to let your Assembly go
    to waste if that day were to come.
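
    To make the point about constants concrete, here is a sketch of
    the same byte-by-byte transparent copy written in plain C, with
    the buffer width passed in instead of being fixed at 352.  The
    name tcopysprite_c and the screen/pitch parameters are
    hypothetical, and this is the simple compare-every-pixel method
    from the quoted routine, not the segment format described above.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Copy a width*height sprite to (x, y) in `screen`, skipping pixels
 * whose value is 0.  `pitch` is the buffer width in bytes, so nothing
 * here depends on a particular resolution.  No clipping is done.
 */
void tcopysprite_c(uint8_t *screen, size_t pitch,
                   int x, int y, int width, int height,
                   const uint8_t *spr)
{
    uint8_t *dst = screen + (size_t)y * pitch + (size_t)x;

    for (int row = 0; row < height; row++) {
        for (int col = 0; col < width; col++) {
            uint8_t px = *spr++;
            if (px != 0)            /* 0 is the transparent value */
                dst[col] = px;
        }
        dst += pitch;               /* next line of the buffer */
    }
}
```

    In an assembly version the same idea amounts to loading the
    pitch from a variable or operand rather than using an immediate.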

>tcopysprite(int x, int y, int width, int height, char *spr)
>{ asm("movl %3, %%ecx                   \n\t"
>      "movl %4, %%esi                   \n\t"
>"tcsl0:                                 \n\t"
>      "movl %1, %%eax                   \n\t"
>      "imul $352, %%eax                 \n\t"
>      "addl %0, %%eax                   \n\t"
>      "addl _virscr, %%eax              \n\t"
>      "movl %%eax, %%edi                \n\t"
>      "movl %2, %%edx                   \n\t"
>"drawloop:                              \n\t"
>      "lodsb                            \n\t"
>      "orb %%al, %%al                   \n\t"
>      "jz nodraw                        \n\t"
>      "stosb                            \n\t"
>      "decl %%edx                       \n\t"
>      "orl %%edx, %%edx                 \n\t"
>      "jz endline                       \n\t"
>      "jmp drawloop                     \n\t"
>"nodraw:                                \n\t"
>      "incl %%edi                       \n\t"
>      "decl %%edx                       \n\t"
>      "orl %%edx, %%edx                 \n\t"
>      "jnz drawloop                     \n\t"
>"endline:                               \n\t"
>      "incl %1                          \n\t"
>      "decl %%ecx                       \n\t"
>      "jnz tcsl0                        \n\t"
>      :
>      : "m" (x), "m" (y), "m" (width), "m" (height), "m" (spr)
>      : "eax","edx","esi","edi","ecx","cc" );
>}

    (note:  I'm a Linux user, and the only windows I have on my
            machine is X windows.  The files for the SS2 game
            were sent to me by a friend who asked me to rip the
            sprites from the game.)


-- 
  beppu AT uci DOT edu .............................................................
