From: beppu AT rigel DOT oac DOT uci DOT edu (John Beppu)
Newsgroups: comp.os.msdos.djgpp
Subject: Re: Optimization and bug smashing.. a lot of other questions too :)
Date: 11 Aug 1997 20:50:38 GMT
Organization: University of California, Irvine
Lines: 103
Message-ID: <5sntuu$99n@news.service.uci.edu>
References: <33ee3f7f DOT 4973504 AT news DOT inlink DOT com>
NNTP-Posting-Host: rigel.oac.uci.edu
To: djgpp AT delorie DOT com
DJ-Gateway: from newsgroup comp.os.msdos.djgpp
Precedence: bulk

In article <33ee3f7f DOT 4973504 AT news DOT inlink DOT com>,
[vecna] wrote:

>Okay, this is the single most important routine to optimize. It's the
>transparent blitter. It's very important to optimize already, and it
>will get VERY important to optimize in the next version... EVERY CYCLE
>COUNTS in this one.

Recently, I reverse engineered the transparent sprite format used in
the Win95 port of the game Samurai Spirits 2 (the classic Neo*Geo game).
They used an interesting technique which did not involve checking every
byte for the "transparent value" like you are doing.

The sprites were restructured so that a routine could be written to
skip over anything that might be transparent without performing a
costly comparison.

A sprite was made up of variable sized structures I call "Scan Lines".
In turn, scan lines are made up of 0 or more variable sized structures
I call "line segments".  The following describes the format of these
two structures.

; Line Segment
    byte 0      horizontal offset (# of pixels to /not/ draw)
    byte 1      # of pixels to draw
    byte 2+     pixel data

; Scan Line
    word 0      # of bytes occupied by all data for this scan line
                (all line segments + the 2 bytes occupied by this
                one word header).
    byte 2+     Line Segment data.

The hierarchy of their sprite format is:

    Sprite
        Scan Line
            Line Segment

Hopefully, my description was not too vague.  This might be something
you want to consider.  You'll have to write a routine that takes a
rectangular sprite and makes a transparent sprite out of it.
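
To make the layout above a bit more concrete, here is a rough sketch in
plain C of what such a converter and a matching draw routine could look
like.  This is not the game's code.  The names rle_encode and rle_draw
are made up for illustration, and several details are my assumptions
rather than part of the format notes: pixel value 0 means transparent,
the scan line word is little-endian, each segment's offset is counted
from the end of the previous segment, runs longer than 255 pixels get
split, and the sprite height is passed in separately.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Convert a rectangular w*h sprite (0 = transparent, an assumption)
   into scan lines made of line segments.  Returns the number of bytes
   written to out; the caller must supply a large enough buffer. */
size_t rle_encode(const uint8_t *src, int w, int h, uint8_t *out)
{
    uint8_t *p = out;
    for (int y = 0; y < h; y++) {
        const uint8_t *row = src + (size_t)y * w;
        uint8_t *hdr = p;               /* reserve the 2-byte header */
        p += 2;
        int x = 0;
        while (x < w) {
            int skip = 0;               /* pixels to /not/ draw */
            while (x < w && row[x] == 0 && skip < 255) { x++; skip++; }
            int run = 0;                /* pixels to draw */
            while (x + run < w && row[x + run] != 0 && run < 255) run++;
            if (run == 0 && x >= w)
                break;                  /* trailing transparency: emit nothing */
            *p++ = (uint8_t)skip;       /* byte 0: horizontal offset */
            *p++ = (uint8_t)run;        /* byte 1: # of pixels to draw */
            memcpy(p, row + x, (size_t)run);   /* byte 2+: pixel data */
            p += run;
            x += run;
        }
        size_t len = (size_t)(p - hdr); /* includes the 2 header bytes */
        assert(len <= 0xFFFF);          /* must fit the 16-bit word */
        hdr[0] = (uint8_t)(len & 0xFF); /* little-endian (assumed) */
        hdr[1] = (uint8_t)(len >> 8);
    }
    return (size_t)(p - out);
}

/* Draw an encoded sprite of h scan lines at dst, where pitch is the
   width of the destination buffer in bytes.  No clipping is done. */
void rle_draw(const uint8_t *spr, int h, uint8_t *dst, int pitch)
{
    for (int y = 0; y < h; y++) {
        size_t len = spr[0] | ((size_t)spr[1] << 8);
        const uint8_t *seg = spr + 2;
        const uint8_t *end = spr + len;
        uint8_t *d = dst;
        while (seg < end) {
            d += seg[0];                /* skip the transparent gap */
            memcpy(d, seg + 2, seg[1]); /* copy the opaque run */
            d += seg[1];
            seg += 2 + seg[1];
        }
        spr = end;                      /* next scan line */
        dst += pitch;
    }
}
```

Note that the draw loop never compares a pixel against the transparent
value; all of that work is done once, at conversion time.  The pitch
parameter also keeps the buffer width out of the routine.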
Then you can make your highly optimized:

    void tcopysprite(tSprite* t);

let's try to minimize the number of push instructions gcc generates.  ^_^

You might want to plan for the future a little by not hard coding
constants (such as the virtual buffer width, 352), because who
knows--some day you may want to use a larger resolution.  It would be
a shame to let your Assembly go to waste if that day were to come.

>tcopysprite(int x, int y, int width, int height, char *spr)
>{ asm("movl %3, %%ecx                  \n\t"
>      "movl %4, %%esi                  \n\t"
>"tcsl0:                                \n\t"
>      "movl %1, %%eax                  \n\t"
>      "imul $352, %%eax                \n\t"
>      "addl %0, %%eax                  \n\t"
>      "addl _virscr, %%eax             \n\t"
>      "movl %%eax, %%edi               \n\t"
>      "movl %2, %%edx                  \n\t"
>"drawloop:                             \n\t"
>      "lodsb                           \n\t"
>      "orb %%al, %%al                  \n\t"
>      "jz nodraw                       \n\t"
>      "stosb                           \n\t"
>      "decl %%edx                      \n\t"
>      "orl %%edx, %%edx                \n\t"
>      "jz endline                      \n\t"
>      "jmp drawloop                    \n\t"
>"nodraw:                               \n\t"
>      "incl %%edi                      \n\t"
>      "decl %%edx                      \n\t"
>      "orl %%edx, %%edx                \n\t"
>      "jnz drawloop                    \n\t"
>"endline:                              \n\t"
>      "incl %1                         \n\t"
>      "decl %%ecx                      \n\t"
>      "jnz tcsl0                       \n\t"
>      :
>      : "m" (x), "m" (y), "m" (width), "m" (height), "m" (spr)
>      : "eax","edx","esi","edi","ecx","cc" );
>}

(note: I'm a Linux user, and the only windows I have on my machine is
X windows.  The files for the SS2 game were sent to me by a friend who
asked me to rip the sprites from the game.)

--
beppu AT uci DOT edu .............................................................