Mail Archives: djgpp/1997/08/09/20:09:08

Message-Id: <m0wxLTg-0007wqC@fwd01.btx.dtag.de>
Date: Sun, 10 Aug 97 02:03 MET DST
To: janko DOT heilgeist AT gmx DOT net, djgpp AT delorie DOT com
References: <01bca43a$76d41440$39a340c2 AT pjheilg DOT goslar DOT netsurf DOT de>
Subject: Re: Compiled faster rle sprites in Allegro?!
MIME-Version: 1.0
From: Georg DOT Kolling AT t-online DOT de (Georg Kolling)

Janko Heilgeist wrote:
> Hi,
> we tried to optimize our tile based game engine by using compiled sprites
> instead of rle sprites.
>
> Result: 30000 tiles need 21 seconds with compiled sprites (20 seconds with
> rle sprites)
>
> We used the sprite routines draw_rle_sprite and draw_compiled_sprite of
> Allegro 2.2
>
> Why?????????????????????

Easy answer: RLE means run-length encoding, so if you have 40 pixels of the
same color in a row, they're represented as two bytes: a counter and a color
byte. In this case, all 40 pixels can be written in one chunk, which is much
faster than writing every single pixel, as compiled sprites do. Usually,
compiled sprites are faster, but if you use only a few colors, RLE sprites
could be faster.
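To make the "one chunk" idea concrete, here is a minimal sketch in C. It
assumes a hypothetical (count, color) byte-pair format as described above.
As far as I know, Allegro's own RLE_SPRITE layout is different (it encodes
runs of transparent pixels and strings of solid pixels), so the function name
and the format here are illustration only, not Allegro's code.

```c
#include <stddef.h>
#include <string.h>

/* Draw one scanline from (count, color) byte pairs.
 * Hypothetical format for illustration, NOT Allegro's RLE_SPRITE layout.
 * Each run is written with a single memset() call instead of `count`
 * separate one-byte stores. Returns the number of pixels written.
 * The caller must make sure dst is large enough for all runs. */
size_t rle_draw_line(unsigned char *dst, const unsigned char *runs,
                     size_t n_runs)
{
    size_t written = 0;
    size_t i;

    for (i = 0; i < n_runs; i++) {
        unsigned char count = runs[2 * i];      /* run length */
        unsigned char color = runs[2 * i + 1];  /* pixel value */

        memset(dst + written, color, count);    /* whole run in one go */
        written += count;
    }
    return written;
}
```

With the runs {40, 7} and {3, 9}, the function fills 40 pixels of color 7
followed by 3 pixels of color 9. The fewer and longer the runs, the less
per-pixel work remains, which is why images with few colors favor this scheme.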

BTW, I don't believe that the Pentium's processor cache makes RLE sprites
faster than compiled sprites, as someone answered to this question. Why?
There are three reasons:
1. A Pentium has an 8 KB internal code cache that continually reads code from
   RAM. Usually, 4 KB of this code cache has already been executed and 4 KB
   is still to be executed, as long as there are no jumps. Since compiled
   sprites do not need jumps, the cache can't read 'wrong' code.
2. Since there are no jumps, branch prediction can't guess wrong. When you use
   RLE sprites, there are jumps, so branch prediction can waste some cycles.
3. A mov instruction with an immediate byte value and a memory address needs
   4 bytes, i.e. the code for a 32x32 sprite (1024 pixels) fits into 4 KB of
   cache.
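The arithmetic behind point 3 can be written down as a tiny C sketch. It
assumes exactly one 4-byte mov per pixel, as stated above, and ignores
transparent pixels, wider stores and the final ret; the function and constant
names are made up for this example.

```c
#include <stddef.h>

/* Assumed size of one "mov byte [reg+offset], imm8" store, in bytes. */
enum { BYTES_PER_PIXEL_STORE = 4 };

/* Rough code size of a compiled sprite that stores every pixel with its
 * own instruction. Only an estimate under the assumption above. */
size_t compiled_sprite_code_bytes(size_t width, size_t height)
{
    return width * height * BYTES_PER_PIXEL_STORE;
}
```

For a 32x32 sprite this gives 32 * 32 * 4 = 4096 bytes, i.e. 4 KB.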

30000 tiles in 20 seconds? Either you use huge tiles or a slow computer, 
I guess...

[...] nce bit (for virtual memory)... If anyone knows it, please share your
knowledge with me...

The next byte is a bit ;-) difficult:
Bits 0-3 are bits 16-19 of the segment size. That makes the segment size only
20 bits wide, which means at most 1048576 bytes (1 MB).
Bit 7 lets you get bigger segments. Set it to 1, and the segment size counts
4 KB 'pages' instead of bytes, so the maximum size is
4 KB * 1048576 = 4 GB.

And finally, the last byte represents bits 24-31 of the base address.

Very weird, but that's for 286 compatibility...
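Putting the scattered fields back together could look like the C sketch
below. The positions of the first six bytes are not part of the text above;
I am assuming the standard 386 descriptor layout (size bits 0-15 in bytes
0-1, base bits 0-23 in bytes 2-4, access byte at offset 5). The stored value
is really a limit, i.e. size minus one, so the sketch adds one before
scaling. Expand-down segments are ignored. The type and function names are
invented for this example.

```c
#include <stdint.h>

typedef struct {
    uint32_t base;        /* linear base address of the segment */
    uint64_t size_bytes;  /* segment size in bytes (up to 4 GB) */
} segment_info;

/* Decode base and size from an 8-byte descriptor.
 * Assumes the standard 386 layout; ignores expand-down segments. */
segment_info decode_descriptor(const uint8_t d[8])
{
    segment_info s;
    uint32_t limit;
    uint64_t units;

    /* 20-bit limit: bytes 0-1 plus the low nibble of byte 6 */
    limit = (uint32_t)d[0]
          | ((uint32_t)d[1] << 8)
          | ((uint32_t)(d[6] & 0x0F) << 16);

    /* 32-bit base: bytes 2-4 plus byte 7 (bits 24-31) */
    s.base = (uint32_t)d[2]
           | ((uint32_t)d[3] << 8)
           | ((uint32_t)d[4] << 16)
           | ((uint32_t)d[7] << 24);

    /* bit 7 of byte 6: count 4 KB pages instead of bytes */
    units = (uint64_t)limit + 1;
    s.size_bytes = (d[6] & 0x80) ? units * 4096 : units;
    return s;
}
```

A flat descriptor with all 20 size bits set and bit 7 set decodes to a 4 GB
segment; the same descriptor with bit 7 cleared decodes to 1 MB.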

Hope this helps... 
