From: "Alexei A. Frounze"
Newsgroups: comp.os.msdos.djgpp
Subject: Re: inefficiency of GCC output code & -O problem
Date: Tue, 18 Apr 2000 11:27:51 +0400
Organization: MTU-Intel ISP
Lines: 95
Message-ID: <38FC0E77.904B12BE@mtu-net.ru>
References: <38F9D717 DOT 9438A3F6 AT mtu-net DOT ru>
 <8df84a DOT 3vvqu6v DOT 0 AT buerssner-17104 DOT user DOT cis DOT dfn DOT de>
 <38FB4094 DOT DE7B5F4C AT mtu-net DOT ru>
 <8dfum2 DOT 3vvqu6v DOT 0 AT buerssner-17104 DOT user DOT cis DOT dfn DOT de>
 <38FB7858 DOT 41B090DB AT mtu-net DOT ru>
 <8dh6kr DOT 3vvqvqr DOT 0 AT buerssner-17104 DOT user DOT cis DOT dfn DOT de>
NNTP-Posting-Host: ppp101-157.dialup.mtu-net.ru
Mime-Version: 1.0
Content-Type: text/plain; charset=koi8-r
Content-Transfer-Encoding: 7bit
X-Trace: gavrilo.mtu.ru 956043528 86544 212.188.101.157 (18 Apr 2000 07:38:48 GMT)
X-Complaints-To: usenet-abuse AT mtu DOT ru
NNTP-Posting-Date: 18 Apr 2000 07:38:48 GMT
Cc: buers AT gmx DOT de
X-Mailer: Mozilla 4.72 [en] (Win95; I)
X-Accept-Language: en,ru
To: djgpp AT delorie DOT com
DJ-Gateway: from newsgroup comp.os.msdos.djgpp
Reply-To: djgpp AT delorie DOT com

Dieter Buerssner wrote:
>
> Alexei A. Frounze wrote:
>
> >3. Dieter, I hope you won't try to convert span() to plane C. :)
>                                                        ^^^^^
> (Nice misspelling. With optimizing plane C-compiler, you shouldn't
> need any assembly for 3d graphics ;)

So why do Doom, Quake, ... and Allegro have ASM???

> > while (n--) {
> >   *scr++ = *(texture+((v1>>8)&0xFF00)+((u1>>16)&0xFF));
> >   u1 += du;
> >   v1 += dv;
> > };
>     ^
> Why this semicolon? The same thing I see everywhere in your sources.

Do you think this semicolon makes anything slower? Only compilation is
slower, by 10^-X seconds, where X is not 0 or 1.
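
For readers following the index expression in the loop quoted above, here is
a small sketch of what it appears to compute. It rests on two assumptions that
the thread does not state outright: u1 and v1 are 16.16 fixed-point texture
coordinates, and the texture is 256x256 with one byte per texel. The function
names texel_offset and texel_offset_long are made up for illustration. Right
shifts of negative ints are implementation-defined in C, so the sketch is only
meant for non-negative coordinates.

```c
/* Hypothetical helper, not from the original sources: the index
   expression from the span loop, pulled out into a function.
   Assumes 16.16 fixed-point u, v and a 256x256 8-bit texture. */
unsigned texel_offset(int u, int v)
{
    /* (v >> 8) & 0xFF00 keeps bits 16..23 of v, already shifted
       into the "row * 256" position; (u >> 16) & 0xFF is the column. */
    return (unsigned)(((v >> 8) & 0xFF00) + ((u >> 16) & 0xFF));
}

/* The same computation written the long way, for comparison. */
unsigned texel_offset_long(int u, int v)
{
    unsigned col = (unsigned)((u >> 16) & 0xFF);  /* integer part of u, wrapped to 0..255 */
    unsigned row = (unsigned)((v >> 16) & 0xFF);  /* integer part of v, wrapped to 0..255 */
    return row * 256u + col;
}
```

Under those assumptions, u = 3.5 and v = 2.0 (0x38000 and 0x20000) give the
offset 2*256 + 3 = 515, and both coordinates wrap around at 256.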

> Assuming n >= 0, and taking the liberty of slightly changing
> your interface (the pointers are not needed), I got after a few
> minutes:
>
> /* Add this to the top of T_Map() */
> static void
> span2(char *scr, char *texture, int n, int u1, int v1, int du, int dv)
> {
>   switch (n&3)
>   {
>     case 3:
>       *scr++ = texture[((v1>>8)&0xFF00)+((u1>>16)&0xFF)];
>       u1 += du;
>       v1 += dv;
>     case 2:
>       *scr++ = texture[((v1>>8)&0xFF00)+((u1>>16)&0xFF)];
>       u1 += du;
>       v1 += dv;
>     case 1:
>       *scr++ = texture[((v1>>8)&0xFF00)+((u1>>16)&0xFF)];
>       u1 += du;
>       v1 += dv;
>   }
>   if ((n >>= 2) != 0)
>   {
>     do
>     {
>       scr[0] = texture[((v1>>8)&0xFF00)+((u1>>16)&0xFF)];
>       u1 += du;
>       v1 += dv;
>       scr[1] = texture[((v1>>8)&0xFF00)+((u1>>16)&0xFF)];
>       u1 += du;
>       v1 += dv;
>       scr[2] = texture[((v1>>8)&0xFF00)+((u1>>16)&0xFF)];
>       u1 += du;
>       v1 += dv;
>       scr[3] = texture[((v1>>8)&0xFF00)+((u1>>16)&0xFF)];
>       u1 += du;
>       v1 += dv;
>       scr += 4;
>     }
>     while (--n != 0);
>   }
> }
>
> I replaced
>
>   span (scr, texture, n, &u1, &v1, du, dv);
>
> by
>
>   span2(scr, texture, n, u1, v1, du, dv);
>
> in T_Map(). Speed went up by 2 FPS ;)

Not here. It's 42...57 FPS instead of 45...70. (USEC=USEC2=1, -O switch)

> I must admit, that this is really surprising. A fast look at
> your assembly implementation has shown: I don't understand it.
> And I actually feel no desire at all to understand it.
> But it certainly looks fast. So, your results may differ.

And they differ. 57 vs 70. Is it an improvement???

Alexei A. Frounze
-----------------------------------------
Homepage: http://alexfru.chat.ru
Mirror:   http://members.xoom.com/alexfru
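
Before comparing frame rates, it may be worth confirming that the unrolled
routine writes the same bytes as the plain loop. Below is a minimal sketch of
such a check. It is not code from either poster: span_simple, span_unrolled,
spans_agree and the TEXEL macro are hypothetical names, the buffers use
unsigned char rather than char, and the test texture and step values are
arbitrary. It says nothing about speed, only about the two versions producing
the same output for n >= 0.

```c
#include <string.h>

/* Texel lookup shared by both versions (same expression as in the thread). */
#define TEXEL(tex, u, v) ((tex)[(((v) >> 8) & 0xFF00) + (((u) >> 16) & 0xFF)])

/* Reference version: the plain loop, with u1/v1 passed by value. */
void span_simple(unsigned char *scr, const unsigned char *texture, int n,
                 int u1, int v1, int du, int dv)
{
    while (n-- > 0) {
        *scr++ = TEXEL(texture, u1, v1);
        u1 += du;
        v1 += dv;
    }
}

/* Unrolled version, following the structure of span2 from the post:
   handle the n % 4 leftover texels first, then groups of four. */
void span_unrolled(unsigned char *scr, const unsigned char *texture, int n,
                   int u1, int v1, int du, int dv)
{
    switch (n & 3) {
    case 3: *scr++ = TEXEL(texture, u1, v1); u1 += du; v1 += dv; /* fall through */
    case 2: *scr++ = TEXEL(texture, u1, v1); u1 += du; v1 += dv; /* fall through */
    case 1: *scr++ = TEXEL(texture, u1, v1); u1 += du; v1 += dv;
    }
    for (n >>= 2; n > 0; n--) {
        scr[0] = TEXEL(texture, u1, v1); u1 += du; v1 += dv;
        scr[1] = TEXEL(texture, u1, v1); u1 += du; v1 += dv;
        scr[2] = TEXEL(texture, u1, v1); u1 += du; v1 += dv;
        scr[3] = TEXEL(texture, u1, v1); u1 += du; v1 += dv;
        scr += 4;
    }
}

/* Returns 1 if both versions fill identical buffers for every span
   length 0..max_n (capped at 256), and 0 otherwise. */
int spans_agree(int max_n)
{
    static unsigned char texture[65536];
    unsigned char a[256], b[256];
    int i, n;

    if (max_n > 256)
        max_n = 256;
    for (i = 0; i < 65536; i++)          /* arbitrary test pattern */
        texture[i] = (unsigned char)(i * 31 + (i >> 8));

    for (n = 0; n <= max_n; n++) {
        memset(a, 0, sizeof a);
        memset(b, 0, sizeof b);
        span_simple(a, texture, n, 0x8000, 0x4000, 0x18000, 0xC000);
        span_unrolled(b, texture, n, 0x8000, 0x4000, 0x18000, 0xC000);
        if (memcmp(a, b, sizeof a) != 0) /* also catches writes past n */
            return 0;
    }
    return 1;
}
```

Calling spans_agree(256) from a test program should return 1; a return of 0
would point at the unrolling rather than at the compiler switches.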