Message-Id: <m0ylxKP-000S41C@inti.gov.ar>
Comments: Authenticated sender is <salvador AT natacha DOT inti DOT gov DOT ar>
From: "Salvador Eduardo Tropea (SET)" <salvador AT inti DOT gov DOT ar>
Organization: INTI
To: Eli Zaretskii <eliz AT is DOT elta DOT co DOT il>, djgpp AT delorie DOT com,
        lubaldo AT adinet DOT com DOT uy
Date: Tue, 16 Jun 1998 12:11:59 +0000
MIME-Version: 1.0
Content-type: text/plain; charset=US-ASCII
Content-transfer-encoding: 7BIT
Subject: Re: 64k demo
References: <3 DOT 0 DOT 1 DOT 32 DOT 19980616011946 DOT 006a2bcc AT adinet DOT com DOT uy>
In-reply-to: <Pine.SUN.3.91.980616093058.12747I-100000@is>
Precedence: bulk

Eli Zaretskii <eliz AT is DOT elta DOT co DOT il> wrote:

> On Tue, 16 Jun 1998 lubaldo AT adinet DOT com DOT uy wrote:
> 
> > 	I am not sure why this is, but maybe it is because GCC knows the size of
> > the global static array and its offset in the data segment, so in a lot of
> > cases it can access the array directly without doing arithmetic to find
> > the address. In the 2nd method it knows neither the address of the array of
> > pointers nor the row pointer, so it has to find the row pointer first
> > (adding the line we request to the first pointer) and then add the column
> > we ask for to the value of that pointer. In the 3rd method it knows
> > where the pointer array is but not the 1d array, so it does not have to do
> > the addition to find the pointer, only the addition of the value of the
> > pointer and the column we want.
> > 	So my question is whether this is true and, if it is not, why is this?
> 
> I doubt the above is true.  

I'm sure it is true ;-). You'll see why.

> At the very least, you should look at the 
> code emitted by the compiler (gcc -S) and show that it indeed generates 
> different code in these cases, and that the differences can indeed 
> justify the speed variation you observed.

Yes, that's the best thing to do, but Ivan doesn't know assembler, so it
isn't much help for him.
 
> It is much more probable that changing the implementation caused you to 
> change your C source in a subtle way that overflowed the processor's data 
> cache, in which case you should see a twofold performance penalty.

That isn't the case here; he is using small sprites.
 
> > 	My recommendation: for multidimensional arrays, try to make them static
> > if you can. For 1d arrays I don't know, but it seems that 1d arrays would
> > also be faster if they are static, since GCC knows the address directly and
> > doesn't have to add the pointer plus the number of the element we want.
> 
> This shouldn't be a problem, since GCC will put this address in a 
> register, and won't access memory more than once.

Here is the problem:

Intel processors have only a few registers; that's sad, but it is a fact.
Small loops sometimes use ALL the registers, so one more register can make a
BIG difference in performance. Using static arrays saves registers: the
address of a static array is a constant that can be encoded directly in the
instruction, so no register is needed to hold the base pointer.
Additionally, gcc is pretty stupid when optimizing some loops. It looks like
it works OK for RISC processors with 64 registers but is an idiot with Intel
processors. I saw this when optimizing my plasma effects: sometimes one
register makes the difference. That's one reason to use
-fomit-frame-pointer and static arrays.
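
To make the three cases from the thread concrete, here is a minimal sketch in
C. It is not Ivan's code: the sprite size and every name in it (ROWS, COLS,
sprite_static, sprite_rows, row_table, pixels, setup_sprites and the sum_*
functions) are assumptions invented for illustration. It only shows which
addresses are known at link time and which have to be loaded at run time;
whether that costs an extra register in a given loop depends on the compiler
version and options.

```c
#include <stdlib.h>

#define ROWS 16   /* hypothetical sprite size, chosen for illustration */
#define COLS 16

/* Method 1: static 2D array. Its address is a link-time constant, so it
   can be encoded in the instruction and needs no base register. */
static unsigned char sprite_static[ROWS][COLS];

/* Method 2: pointer to a table of row pointers, both allocated at run
   time. Each access loads the row pointer, then indexes the column. */
static unsigned char **sprite_rows;

/* Method 3: static table of row pointers into a run-time 1D buffer.
   The table address is constant; only the row pointer is loaded. */
static unsigned char *row_table[ROWS];
static unsigned char *pixels;

/* Fill all three layouts with the same data. Returns 1 on success. */
int setup_sprites(void)
{
    int x, y;

    pixels = malloc(ROWS * COLS);
    sprite_rows = malloc(ROWS * sizeof *sprite_rows);
    if (pixels == NULL || sprite_rows == NULL) {
        free(pixels);
        free(sprite_rows);
        pixels = NULL;
        sprite_rows = NULL;
        return 0;
    }
    for (y = 0; y < ROWS; y++) {
        row_table[y] = pixels + y * COLS;
        sprite_rows[y] = row_table[y];   /* both tables share one buffer */
        for (x = 0; x < COLS; x++) {
            unsigned char v = (unsigned char)(y * COLS + x);
            sprite_static[y][x] = v;
            pixels[y * COLS + x] = v;
        }
    }
    return 1;
}

unsigned long sum_static(void)           /* method 1 */
{
    unsigned long s = 0;
    int x, y;
    for (y = 0; y < ROWS; y++)
        for (x = 0; x < COLS; x++)
            s += sprite_static[y][x];
    return s;
}

unsigned long sum_rows(void)             /* method 2 */
{
    unsigned long s = 0;
    int x, y;
    for (y = 0; y < ROWS; y++)
        for (x = 0; x < COLS; x++)
            s += sprite_rows[y][x];
    return s;
}

unsigned long sum_table(void)            /* method 3 */
{
    unsigned long s = 0;
    int x, y;
    for (y = 0; y < ROWS; y++)
        for (x = 0; x < COLS; x++)
            s += row_table[y][x];
    return s;
}
```

Call setup_sprites() once from main() before using the sum_* functions.
Compiling the file with "gcc -O2 -S", with and without -fomit-frame-pointer,
and comparing the inner loops of the three functions is the way to check
whether the register pressure really differs on your compiler.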

SET 
------------------------------------ 0 --------------------------------
Visit my home page: http://set-soft.home.ml.org/
or
http://www.geocities.com/SiliconValley/Vista/6552/
Salvador Eduardo Tropea (SET). (Electronics Engineer)
Alternative e-mail: set-soft AT usa DOT net set AT computer DOT org
ICQ: 2951574
Address: Curapaligue 2124, Caseros, 3 de Febrero
Buenos Aires, (1678), ARGENTINA
TE: +(541) 759 0013