Mail Archives: djgpp/2000/04/17/21:43:12

From: "Alexei A. Frounze" <alex DOT fru AT mtu-net DOT ru>
Newsgroups: comp.os.msdos.djgpp
Subject: Re: inefficiency of GCC output code & -O problem
Date: Tue, 18 Apr 2000 05:15:05 +0400
Organization: MTU-Intel ISP
Lines: 61
Message-ID: <38FBB719.3915C530@mtu-net.ru>
References: <Pine DOT LNX DOT 4 DOT 10 DOT 10004180455310 DOT 1540-100000 AT darkstar DOT grendel DOT net>
NNTP-Posting-Host: ppp97-207.dialup.mtu-net.ru
Mime-Version: 1.0
X-Trace: gavrilo.mtu.ru 956020475 31802 212.188.97.207 (18 Apr 2000 01:14:35 GMT)
X-Complaints-To: usenet-abuse AT mtu DOT ru
NNTP-Posting-Date: 18 Apr 2000 01:14:35 GMT
Cc: buers AT gmx DOT de, kalum AT lintux DOT cx
X-Mailer: Mozilla 4.72 [en] (Win95; I)
X-Accept-Language: en,ru
To: djgpp AT delorie DOT com
DJ-Gateway: from newsgroup comp.os.msdos.djgpp
Reply-To: djgpp AT delorie DOT com

Kalum Somaratna aka Grendel wrote:
> 
> On 17 Apr 2000, Dieter Buerssner wrote:
> >
> > Alexei's code will "cache" some values on the FPU stack, which
> > gcc is not able to see (with the switches I used). Nevertheless,
> > even here, with the help of only one line of inline assembly,
> > it produces comparable results. Again, it would lose if all
> > those references and address-of operations were omitted.
> > It should be clear that the compiler won't reach the efficiency
> > of hand-optimized assembler code. Whether the relatively small
> > difference here is worth all the trouble, ...
> 
> This is precisely the point that I was trying to make earlier: a good
> optimising compiler can produce code which is as fast as, or faster than,
> hand-optimised asm. And in this case it did... It upped the frame rate from
> 70 FPS with the assembly code to 72 FPS with the C code .....:-)
> 
> So as you can see, Alexei, writing the code in inline assembly and adding
> all those "tricks" didn't amount to much difference really. Whether
> getting a reduction!!! of 2 FPS is worth the pain of coding in assembly,
> and also the resultant decrease in readability of the code, is indeed
> questionable.

You've forgotten (in fact, Dieter didn't mention it) the FIDIVRL instruction,
which executes in parallel with the span() function. This is the real trick
that makes a difference. Even Dieter didn't change it; he left this piece of
my inline ASM as is.

With the FIDIVRL trick the frame rate is 45...70 FPS; without it, 36...52 FPS.
Check this yourself or ask Dieter. :)
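
To illustrate the idea for readers without the source, here is a minimal
sketch in plain C of the ordering the trick relies on: the slow divide for
the next span is issued first, then the integer-only span loop runs, and the
quotient is only consumed afterwards. This is not my actual inline ASM. The
names draw_spans(), the span() signature, the 65536.0 scale factor and the
z[]/x[] arrays are all assumptions made up for the example.

```c
#include <assert.h>

/* Hypothetical stand-in for span(): an integer-only inner loop
   that fills one horizontal run of pixels with a single color. */
static void span(unsigned char *row, int x0, int x1, unsigned char color)
{
    int x;
    for (x = x0; x < x1; x++)
        row[x] = color;
}

/* Draw n spans; span i covers x[i] .. x[i+1]-1, so x[] has n+1 entries.
   The divide needed by span i+1 is started BEFORE span i is drawn, and
   its result is not touched until the next iteration. On an x87 FPU a
   divide has long latency, so the integer loop in span() has a chance
   to run while the FPU is still busy. Returns the sum of the quotients
   only so that the divides are observable in this example. */
static double draw_spans(unsigned char *row, const int *z,
                         const int *x, int n)
{
    double sum = 0.0;
    double inv_z;
    int i;

    assert(n > 0);
    inv_z = 65536.0 / z[0];            /* first divide: nothing to overlap with */

    for (i = 0; i < n; i++) {
        double cur = inv_z;            /* quotient computed one step earlier */

        if (i + 1 < n)
            inv_z = 65536.0 / z[i + 1];    /* issue the next divide now ... */

        span(row, x[i], x[i + 1], (unsigned char)(i + 1)); /* ... integer work */

        sum += cur;                    /* consume the earlier result */
    }
    return sum;
}
```

Note that in C this is only a hint: the compiler is free to reorder the
divide and the call, and whether the two really overlap depends on the CPU.
That is why this one piece stayed in inline assembly, where the instruction
order is fixed by hand.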

But for sure, the rest of my inline code can be replaced with C.

> This is most often what happens in the places I work: programmers, without
> thinking about the speed of C code and thinking that *they are very
> smart*, directly write a routine in assembly, generating "hand sloptimized"
> code in the process and introducing countless bugs too...
> 
> > Alexei, I have made some fun. I hope I have made up for it, by this
> > post, that took actually longer to write, than the coding.
> > I will send you the modified source by email. The post hopefully
> > was of general interest.
> 
> Yes indeed, Dieter, this was certainly of great interest and we owe
> you a big thank you :-)

Thanks, for sure. Two clever heads are much better than one. :)

> It once again proved that a good optimising compiler can do an excellent
> job... and also the wisdom of running a profiler and seeing which routines
> take up the most time...

Yes, we have proved that. We also haven't thrown away all of my inline ASM:
the FIDIVRL trick is still alive. :)

bye.
Alexei A. Frounze
-----------------------------------------
Homepage: http://alexfru.chat.ru
Mirror:   http://members.xoom.com/alexfru
