From: baldo AT luna DOT internet DOT com DOT uy
Message-Id: <3.0.1.32.19971229014358.0069b790@mail.internet.com.uy>
Date: Mon, 29 Dec 1997 01:43:58 -0300
To: dave DOT nugent AT ns DOT sympatico DOT ca, djgpp AT delorie DOT com
Subject: Re: Help with optimizing for speed
In-Reply-To: <34A40538.F19@ns.sympatico.ca>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Precedence: bulk

Hello!

At 11:27 AM 26/12/1997 -0800, you wrote:
>Hello, can anyone tell me if they see a way to optimize this code at all.
>I am trying to write a scrolling style game. Nothing fancy.
>I have a large buffer set up around 1.2MB called screen_hold that holds the
>entire level (draw). I then copy 160 lines*320 bytes of this to a secondary
>buffer called screen where I will then add sprites to the background and
>blast to vga memory screen mode13. I am using 160 lines, because the bottom
>40 lines will be used for a score bar & other info that will not always need
>to be redrawn constantly. I am just new to DJGPP and have been using Borland
>C++ v3.0 for DOS up til now, but figured I could really increase the speed
>with a 32 bit compiler, but with the code I am using, there is not much
>difference (speed wise) between the code generated by DJGPP and Borland's
>Turbo C++ 16bit code. Is there a way that would be faster? I'm trying
>to get speed similar to that in Jazz JackRabbit..

I can't think of a faster way than this in 32-bit code! The DJGPP
version should definitely be faster. And I have a question: how did you
manage to handle 1.2 MB of memory in Borland?

>
>// xoxoxoxoxoxo Snipped code... xoxoxoxoxoxoxoxoxoxoxoxox
>for(loop1=0;loop1<160;loop1++)
>memcpy(screen+loop1*320,screen_hold+offset+loop1*3200,320);

Try to do it in this manner:

for(loop1=159; loop1>=0; loop1--)
    memcpy(screen+loop1*320,screen_hold+offset+loop1*3200,320);

Try to write all your loops in this manner, counting down from a number
to zero (decrementing with --). The loop test then becomes a comparison
against zero, which can optimize a little...
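To make the count-down copy easier to reuse, here is a minimal sketch
that wraps the same loop in a function. The name copy_view and its
parameters are hypothetical (they are not in the original code);
src_pitch stands for the 3200 bytes between the starts of consecutive
lines in screen_hold, and width for the 320 bytes copied per line.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy `rows` lines of `width` bytes out of a larger source buffer.
 * `src_pitch` is the distance in bytes between the starts of two
 * consecutive source lines; destination lines are packed back to back.
 * copy_view and its parameter names are illustrative assumptions. */
void copy_view(unsigned char *dst, const unsigned char *src,
               size_t src_pitch, size_t width, int rows)
{
    int row;

    assert(dst != NULL && src != NULL && rows >= 0);

    /* Count down to zero, as suggested above. */
    for (row = rows - 1; row >= 0; row--) {
        memcpy(dst + (size_t)row * width,
               src + (size_t)row * src_pitch,
               width);
    }
}
```

With the buffers from the original post, the call would look like
copy_view(screen, screen_hold + offset, 3200, 320, 160). Whether the
count-down form is measurably faster than the count-up form depends on
the compiler and its optimization level, so it is worth timing both.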
Also (it does not appear in this case, but...), you can optimize this
kind of code:

for(Y=123; Y>=0; Y--)
{
    for(X=321; X>=0; X--)
    {
        BUFFER[X][Y]=A_VALUE;
    }
}

by changing it in this manner:

for(X=321; X>=0; X--)
{
    for(Y=123; Y>=0; Y--)
    {
        BUFFER[X][Y]=A_VALUE;
    }
}

It is faster to vary the last index of an array in the innermost loop
and the other indices in order outward from there. C stores arrays in
row-major order, so consecutive values of the last index are adjacent
in memory.

These are some of the little optimizations I know. There are others,
but you have to figure them out yourself, because they depend on the
kind of code.

>// Now send the screen buffer to VGA Memory..
>_dosmemputl(screen, 16000, 0xa0000); // Send "screen" buffer to VGA MEMORY
>
>// Ok.. firstly I know I can use shifts << for the multip's. I just wrote
>it this way to make easier to understand.

DJGPP and many other compilers optimize this for you, so don't worry.

>I am copying 160 lines of 320 bytes from screen_hold to screen buffers
>the offset and 3200 are related to where the screen is located in the
>buffer screen_hold.
>

Goodbye! HTH!

Ivan Baldo:
E-Mail: baldo AT internet DOT com DOT uy.
Alternate E-Mail: ibaldo AT usa DOT net.
Another alternate E-Mail: lubaldo AT adinet DOT com DOT uy.
Web page: http://xoom.com/baldo.
Phone: (598) (2) 613 3223.
Caldas 1781, Malvin, Montevideo, Uruguay, South America.
My WEB page:
- Icd (a fast fuzzy directory changer for DOS, Freeware compiled with
  DJGPP and with full source code).
- Some other silly things...
- English (new) and Spanish languages.