Mail Archives: djgpp/2000/04/18/05:23:21

From: "Alexei A. Frounze" <alex DOT fru AT mtu-net DOT ru>
Newsgroups: comp.os.msdos.djgpp
Subject: Re: inefficiency of GCC output code & -O problem
Date: Tue, 18 Apr 2000 11:27:51 +0400
Organization: MTU-Intel ISP
Lines: 95
Message-ID: <38FC0E77.904B12BE@mtu-net.ru>
References: <Pine DOT LNX DOT 4 DOT 10 DOT 10004161837540 DOT 1138-100000 AT darkstar DOT grendel DOT net> <38F9D717 DOT 9438A3F6 AT mtu-net DOT ru> <8df84a DOT 3vvqu6v DOT 0 AT buerssner-17104 DOT user DOT cis DOT dfn DOT de> <38FB4094 DOT DE7B5F4C AT mtu-net DOT ru> <8dfum2 DOT 3vvqu6v DOT 0 AT buerssner-17104 DOT user DOT cis DOT dfn DOT de> <38FB7858 DOT 41B090DB AT mtu-net DOT ru> <8dh6kr DOT 3vvqvqr DOT 0 AT buerssner-17104 DOT user DOT cis DOT dfn DOT de>
NNTP-Posting-Host: ppp101-157.dialup.mtu-net.ru
Mime-Version: 1.0
X-Trace: gavrilo.mtu.ru 956043528 86544 212.188.101.157 (18 Apr 2000 07:38:48 GMT)
X-Complaints-To: usenet-abuse AT mtu DOT ru
NNTP-Posting-Date: 18 Apr 2000 07:38:48 GMT
Cc: buers AT gmx DOT de
X-Mailer: Mozilla 4.72 [en] (Win95; I)
X-Accept-Language: en,ru
To: djgpp AT delorie DOT com
DJ-Gateway: from newsgroup comp.os.msdos.djgpp
Reply-To: djgpp AT delorie DOT com

Dieter Buerssner wrote:
> 
> Alexei A. Frounze wrote:
> 
> >3. Dieter, I hope you won't try to convert span() to plane C. :)
>                                                       ^^^^^
> (Nice misspelling. With optimizing plane C-compiler, you shouldn't
> need any assembly for 3d graphics ;)

So why do Doom, Quake, ... and Allegro have ASM?


> >      while (n--) {
> >        *scr++ = *(texture+((v1>>8)&0xFF00)+((u1>>16)&0xFF));
> >        u1 += du;
> >        v1 += dv;
> >      };
>         ^
> Why this semicolon? The same thing I see everywhere in your sources.

Do you think this semicolon makes anything slower? Only compilation gets slower,
by 10^-X seconds, where X is not 0 or 1.
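
To illustrate the point about the semicolon: in C, a lone ';' after the closing
brace of a loop body is a null statement, so it should not change the generated
code. The function below is a hypothetical example (sum_below is not from the
sources under discussion) and only a minimal sketch. The one place where the
extra ';' is not harmless is between the block of an if and its else, because
there it ends the if statement and the else no longer compiles.

```c
/* Sums the integers 0..n-1. Hypothetical helper, for illustration only. */
int sum_below(int n)
{
    int i = 0, total = 0;

    while (i < n) {
        total += i;
        i++;
    };                  /* null statement: legal C, no effect at run time */

    return total;
}
```

Calling sum_below(5) adds 0+1+2+3+4, with or without the extra semicolon.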

> Assuming n >= 0, and taking the liberty of slightly changing
> your interface (the pointers are not needed), I got after a few
> minutes:
> 
> /* Add this to the top of T_Map() */
> static void
> span2(char *scr, char *texture, int n, int u1, int v1, int du, int dv)
> {
>   switch (n&3)
>   {
>     case 3:
>       *scr++ = texture[((v1>>8)&0xFF00)+((u1>>16)&0xFF)];
>       u1 += du;
>       v1 += dv;
>     case 2:
>       *scr++ = texture[((v1>>8)&0xFF00)+((u1>>16)&0xFF)];
>       u1 += du;
>       v1 += dv;
>     case 1:
>       *scr++ = texture[((v1>>8)&0xFF00)+((u1>>16)&0xFF)];
>       u1 += du;
>       v1 += dv;
>   }
>   if ((n >>= 2) != 0)
>   {
>     do
>     {
>       scr[0] = texture[((v1>>8)&0xFF00)+((u1>>16)&0xFF)];
>       u1 += du;
>       v1 += dv;
>       scr[1] = texture[((v1>>8)&0xFF00)+((u1>>16)&0xFF)];
>       u1 += du;
>       v1 += dv;
>       scr[2] = texture[((v1>>8)&0xFF00)+((u1>>16)&0xFF)];
>       u1 += du;
>       v1 += dv;
>       scr[3] = texture[((v1>>8)&0xFF00)+((u1>>16)&0xFF)];
>       u1 += du;
>       v1 += dv;
>       scr += 4;
>     }
>     while (--n != 0);
>   }
> }
> 
> I replaced
> 
>       span (scr, texture, n, &u1, &v1, du, dv);
> 
> by
> 
>       span2(scr, texture, n, u1, v1, du, dv);
> 
> in T_Map(). Speed went up by 2 FPS ;)

Not here. I get 42...57 FPS with span2() instead of 45...70 with span().
(USEC=USEC2=1, -O switch)
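
Before comparing frame rates it may be worth checking that the unrolled
version writes exactly the same pixels as the simple loop. The following is
only a sketch under stated assumptions: the names span_ref, span_unrolled and
count_span_mismatches are made up for this check, the texture is assumed to be
256x256 bytes (which is what the 0xFF00 and 0xFF masks suggest), and the
start values and steps are arbitrary small numbers chosen so that the
fixed-point counters never overflow. It says nothing about speed.

```c
#include <stddef.h>
#include <string.h>

#define TEX_SIZE 65536          /* assumed: 256x256 texels, one byte each */
#define MAX_SPAN 255            /* longest span this check will try */

/* The simple loop from the quoted code, for n >= 0. */
static void span_ref(unsigned char *scr, const unsigned char *texture,
                     int n, int u1, int v1, int du, int dv)
{
    while (n-- > 0) {
        *scr++ = texture[((v1 >> 8) & 0xFF00) + ((u1 >> 16) & 0xFF)];
        u1 += du;
        v1 += dv;
    }
}

/* Same structure as span2(): leftover n&3 pixels first, then groups of 4. */
static void span_unrolled(unsigned char *scr, const unsigned char *texture,
                          int n, int u1, int v1, int du, int dv)
{
    switch (n & 3) {
    case 3:
        *scr++ = texture[((v1 >> 8) & 0xFF00) + ((u1 >> 16) & 0xFF)];
        u1 += du; v1 += dv;
        /* fall through */
    case 2:
        *scr++ = texture[((v1 >> 8) & 0xFF00) + ((u1 >> 16) & 0xFF)];
        u1 += du; v1 += dv;
        /* fall through */
    case 1:
        *scr++ = texture[((v1 >> 8) & 0xFF00) + ((u1 >> 16) & 0xFF)];
        u1 += du; v1 += dv;
    }
    for (n >>= 2; n > 0; n--) {
        scr[0] = texture[((v1 >> 8) & 0xFF00) + ((u1 >> 16) & 0xFF)];
        u1 += du; v1 += dv;
        scr[1] = texture[((v1 >> 8) & 0xFF00) + ((u1 >> 16) & 0xFF)];
        u1 += du; v1 += dv;
        scr[2] = texture[((v1 >> 8) & 0xFF00) + ((u1 >> 16) & 0xFF)];
        u1 += du; v1 += dv;
        scr[3] = texture[((v1 >> 8) & 0xFF00) + ((u1 >> 16) & 0xFF)];
        u1 += du; v1 += dv;
        scr += 4;
    }
}

/* Returns how many span lengths in 0..max_n give different output. */
int count_span_mismatches(int max_n)
{
    static unsigned char texture[TEX_SIZE];
    unsigned char a[MAX_SPAN + 1], b[MAX_SPAN + 1];
    size_t i;
    int n, bad = 0;

    if (max_n > MAX_SPAN)
        max_n = MAX_SPAN;               /* keep writes inside a[] and b[] */

    for (i = 0; i < TEX_SIZE; i++)      /* arbitrary non-uniform test pattern */
        texture[i] = (unsigned char)(i * 31u + (i >> 8));

    for (n = 0; n <= max_n; n++) {
        memset(a, 0xAA, sizeof a);      /* same fill, so stray writes show up */
        memset(b, 0xAA, sizeof b);
        span_ref(a, texture, n, 0x1234, 0x5678, 0x18000, 0x14000);
        span_unrolled(b, texture, n, 0x1234, 0x5678, 0x18000, 0x14000);
        if (memcmp(a, b, sizeof a) != 0)
            bad++;
    }
    return bad;
}
```

Calling count_span_mismatches(64) from main() and printing the result is
enough to use it; a result of 0 means the two versions agreed for every span
length tried. Comparing the whole buffers, not just the first n bytes, also
catches a version that writes past the end of the span.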

> I must admit, that this is really surprising. A fast look at
> your assembly implementation has shown: I don't understand it.
> And I actually feel no desire at all to understand it.
> But it certainly looks fast. So, your results may differ.

And they do differ: 57 vs 70. Is that an improvement?


Alexei A. Frounze
-----------------------------------------
Homepage: http://alexfru.chat.ru
Mirror:   http://members.xoom.com/alexfru

