delorie.com/archives/browse.cgi   search  
Mail Archives: djgpp/1997/06/11/04:32:49

Newsgroups: comp.os.msdos.djgpp
From: tob AT world DOT std DOT com
Subject: Re: Allegro & sprite stretching optimization
Message-ID: <EBGuzo.C5w@world.std.com>
Sender: tob AT world DOT std DOT com (Tom Breton)
Reply-To: tob AT world DOT std DOT com
Organization: BREnterprises
References: <0ufufgARsrmzEwwy AT talula DOT demon DOT co DOT uk>
Date: Sun, 8 Jun 1997 16:54:59 GMT
Lines: 82
To: djgpp AT delorie DOT com
DJ-Gateway: from newsgroup comp.os.msdos.djgpp

Shawn Hargreaves <Shawn AT talula DOT demon DOT co DOT uk> writes:
> [He who is too great to require attribution }:) writes:]
> >    There are enough free variables for every stretcher that it still
> >    will probably take a while to search the cache. I introduced a hash
>
> How big is this cache?

I've coded it to be variable. I'm defaulting to 10 entries.

> I can only see 6 parameters, and two of those
> (mode-X vs. linear and masked vs. solid) can be combined into a
> bitfield. The dest_x and dest_width could fit happily into 16 bits each,
> which would leave only 4 ints to be tested. Using a cache of say four
> entries, with the most recently used at the top, I can't believe that
> would be a significant expense!

I've fully coded and tested both my designs now, but because my
application only uses 1 particular scaling all I can say about speed is:
Yes, it speeds up a lot, and there's no discernable speed difference
between my more and less ambitious design, but that's almost certainly
because there's just one scaling.

Your idea of compacting the representation is interesting, but I did a
different thing to speed it up that makes compacting less of an issue.
As I mentioned, I added a hash field to the key structure to speed the
comparison up. This means:

The first thing tested per entry is the hash field which usually aborts
all the other tests so the number of other field-tests required no
longer multiply by the number of entries.

All fields have to be hashed once per search.

The full comparison is done on average something like 1.000001 times per
search.

The cost of operations required to compact the representation must be
weighed against the savings for the full search and for quicker (because
start out with fewer ints) hashing. Your idea probably slightly wins.

The hash algorithm I used is totally off the cuff and is probably the
first thing to modify if more speed is needed.


Whether the lookup would be a significant expense depends (naturally) on
how often the stretching is used. I want to point out that specifically
with stretch_sprite, it's an issue, because it's natural to use that to
display a lot of little bitmaps.


> >    It wasn't an issue when compilations and uses were 1-to-1, but on
> >    separating them there are a couple of issues WRT how precise a match
> >    a user will demand. For example, I can easily imagine that a user,
> >    stretching a sprite, would not care whether the stretch began on the
> >    correct fractional pixel. In other cases, I can imagine a user might
>
> I can't help thinking that anyone who needs that degree of control over
> something is best off hacking the sources to fit their needs...
>
> I take your point that a caching algorithm is never going to handle
> every possible situation as efficiently as exposing the implementation
> details for the user to control, I still think it's a much better
> solution. For starters, it will speed up all programs, regardless of
> whether the coder knows/understands/cares how the function works. More
> importantly, it doesn't place any restrictions on future developments of
> Allegro, and it avoids muddying the API with messy implementation
> details. I know that sounds like something stuffy out of a "good
> programming guidelines" textbook, but hey, this stuff really does
> matter! :-)

Yeah, I do have a chronic blindspot with overambitious designs, so
you're probably right.

Anyways, I coded but didn't test a design that exposes an intermediate
representation type, and function to make the intermediate rep, and a
function to use that rep. I'll make that available too, just in case
anybody needs to go even faster. I assume from your comments that you'll
largely ignore that file (stretch1.cc), though I don't think they're as
messy as you expect.

        Tom

- Raw text -


  webmaster     delorie software   privacy  
  Copyright © 2019   by DJ Delorie     Updated Jul 2019