Mail Archives: djgpp/2000/04/17/20:54:51

From: "Alexei A. Frounze" <alex DOT fru AT mtu-net DOT ru>
Newsgroups: comp.os.msdos.djgpp
Subject: Re: inefficiency of GCC output code & -O problem
Date: Tue, 18 Apr 2000 04:47:58 +0400
Organization: MTU-Intel ISP
Lines: 38
Message-ID: <>
References: <Pine DOT LNX DOT 4 DOT 10 DOT 10004161837540 DOT 1138-100000 AT darkstar DOT grendel DOT net> <38F9D717 DOT 9438A3F6 AT mtu-net DOT ru> <8df84a DOT 3vvqu6v DOT 0 AT buerssner-17104 DOT user DOT cis DOT dfn DOT de> <38FB4094 DOT DE7B5F4C AT mtu-net DOT ru> <8dfum2 DOT 3vvqu6v DOT 0 AT buerssner-17104 DOT user DOT cis DOT dfn DOT de> <38FB7858 DOT 41B090DB AT mtu-net DOT ru>
Mime-Version: 1.0
X-Trace: 956018945 9232 (18 Apr 2000 00:49:05 GMT)
X-Complaints-To: usenet-abuse AT mtu DOT ru
NNTP-Posting-Date: 18 Apr 2000 00:49:05 GMT
Cc: buers AT gmx DOT de
X-Mailer: Mozilla 4.72 [en] (Win95; I)
X-Accept-Language: en,ru
To: djgpp AT delorie DOT com
DJ-Gateway: from newsgroup comp.os.msdos.djgpp
Reply-To: djgpp AT delorie DOT com

"Alexei A. Frounze" wrote:
> Well, let me say a few words in conclusion. ;)
> 1. You simply proved that GCC has an efficient enough optimizer. Okay, I agree.
> Your code that runs 2 FPS faster for you runs the same for me as before. I
> don't think that makes it meaningfully faster than mine (just 2.9%).
> So, we have a good optimizer and you proved this. Great. I'm glad. This means I
> can throw away a lot of inline ASM now.
> 2. If I had known that (int)(x) is slow and if I had had a proper manual on
> inline ASM, I would have achieved the same with fewer problems.
> 3. Dieter, I hope you won't try to convert span() to plain C. :)
> This replacement isn't nearly as fast:
> --------------8<----------------
>       while (n--) {
>         *scr++ = *(texture+((v1>>8)&0xFF00)+((u1>>16)&0xFF));
>         u1 += du;
>         v1 += dv;
>       };
> --------------8<----------------
> Anyway, thank you. And btw, thanks to myself. If I hadn't written efficient C
> code between /* */ :), Dieter would never have proved that GCC has a good
> optimizer, because he doesn't know the texture-mapping algorithm (do you?).
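
For reference, here is a minimal self-contained sketch of what the C loop in
point 3 computes. The function name span_c, the parameter types, the 256x256
8-bit texture layout and the 16.16 fixed-point format for u1/v1 are assumptions
read off the masks and shifts in the loop; they are not taken from the real
span().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical plain-C span: writes n texels to scr.
   Assumes u1, v1 are 16.16 fixed point and texture is 256x256 bytes. */
static void span_c(uint8_t *scr, const uint8_t *texture,
                   uint32_t u1, uint32_t v1,
                   int32_t du, int32_t dv, size_t n)
{
    assert(scr != NULL && texture != NULL);
    while (n--) {
        /* (v1 >> 8) & 0xFF00 is the integer part of v times 256 (row offset);
           (u1 >> 16) & 0xFF  is the integer part of u (column). */
        *scr++ = texture[((v1 >> 8) & 0xFF00) + ((u1 >> 16) & 0xFF)];
        /* Unsigned addition wraps, so negative steps also work. */
        u1 += (uint32_t)du;
        v1 += (uint32_t)dv;
    }
}
```

Calling span_c(out, tex, 0, 0, 1 << 16, 1 << 16, 4) walks the texture diagonal
one texel at a time, starting at (0, 0).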

And btw, the main trick:

4. The FIDIVRL instruction executes in parallel with span(). If we remove
this inline code and put C in its place, we'll lose performance. That is also
proof that my inline asm is not redundant. :)
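
The idea in point 4 can be sketched in portable C, though only as an
illustration. The names span_int and segment_with_overlap, the z_next parameter
and the 65536.0f numerator are all hypothetical; the real inline asm is not
shown in this message. The point is the ordering: the long-latency divide is
started first, the integer-only loop runs while it completes, and the quotient
is consumed only afterwards. In plain C the compiler is free to reorder these
statements, so the overlap is not guaranteed the way it is with hand-placed asm.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical integer-only inner loop (same shape as the quoted C loop). */
static void span_int(uint8_t *scr, const uint8_t *texture,
                     uint32_t u1, uint32_t v1,
                     int32_t du, int32_t dv, size_t n)
{
    while (n--) {
        *scr++ = texture[((v1 >> 8) & 0xFF00) + ((u1 >> 16) & 0xFF)];
        u1 += (uint32_t)du;
        v1 += (uint32_t)dv;
    }
}

/* Draws one segment and returns the reciprocal needed for the next one. */
static float segment_with_overlap(uint8_t *scr, const uint8_t *texture,
                                  uint32_t u1, uint32_t v1,
                                  int32_t du, int32_t dv, size_t n,
                                  int32_t z_next)
{
    assert(z_next != 0);
    /* Written first so the divide can be in flight during the loop. */
    float inv_z = 65536.0f / (float)z_next;
    /* Integer work that does not depend on inv_z. */
    span_int(scr, texture, u1, v1, du, dv, n);
    /* First use of the quotient comes after the loop. */
    return inv_z;
}
```

Whether GCC actually keeps the divide ahead of the loop depends on its
scheduling for the target CPU, which is why the asm version pins the order by
hand.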

Alexei A. Frounze
