Mail Archives: djgpp/1997/03/05/22:00:32

Newsgroups: comp.os.msdos.djgpp
From: Peter Berdeklis <peter AT atmosp DOT physics DOT utoronto DOT ca>
Subject: Re: Netlib code [was Re: flops...]
Message-ID: <Pine.SGI.3.91.970305110130.8933B-100000@atmosp.physics.utoronto.ca>
Nntp-Posting-Host: chinook.physics.utoronto.ca
Sender: news AT info DOT physics DOT utoronto DOT ca (System Administrator)
Mime-Version: 1.0
Organization: University of Toronto - Dept. of Physics
In-Reply-To: <199703030209.MAA08035@solwarra.gbrmpa.gov.au>
Date: Wed, 5 Mar 1997 16:11:39 GMT
References: <199703030209 DOT MAA08035 AT solwarra DOT gbrmpa DOT gov DOT au>
Lines: 24
To: djgpp AT delorie DOT com
DJ-Gateway: from newsgroup comp.os.msdos.djgpp

First, as has been mentioned in other threads, reducing the precision of 
the FPU to less than 64 bits does not generally reduce the execution 
time, although it does significantly reduce the precision.  I would 
suggest using 64 bit doubles unless space is a significant concern.
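
To give a feel for how much precision is lost, here is a minimal sketch in
C.  It compares float and double storage rather than changing the FPU
precision control itself, so treat it as an illustration of the trade-off
only.  The function names (sum_float, sum_double, abs_error,
report_precision) are made up for this example.

```c
#include <stdio.h>

/* Add `value` to a single-precision accumulator n times.
   `volatile` forces every partial sum to be stored as a float, so the
   FPU cannot silently keep it in a wider register. */
float sum_float(float value, long n)
{
    volatile float acc = 0.0f;
    long i;
    for (i = 0; i < n; i++)
        acc += value;
    return acc;
}

/* Same loop with a double-precision accumulator. */
double sum_double(double value, long n)
{
    volatile double acc = 0.0;
    long i;
    for (i = 0; i < n; i++)
        acc += value;
    return acc;
}

/* Absolute difference, written out by hand so libm is not needed. */
double abs_error(double got, double expected)
{
    double d = got - expected;
    return d < 0.0 ? -d : d;
}

/* Print the error of each variant after one million additions of 0.1
   (the exact mathematical sum is 100000). */
void report_precision(void)
{
    double ef = abs_error((double)sum_float(0.1f, 1000000L), 100000.0);
    double ed = abs_error(sum_double(0.1, 1000000L), 100000.0);
    printf("float error:  %g\n", ef);
    printf("double error: %g\n", ed);
}
```

Calling report_precision() from main should show a far larger error for
the float accumulator than for the double one.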

As to the problem of gcc reloading memory locations already in registers,
this is not just a problem of FPU code.  I used the -S option to generate
asm code for an interrupt handler that I wrote in C.  The asm code had _a
lot_ of unnecessary register reloads that I had to eliminate.  I think
that this is a deficiency in the gcc optimizer.  (Considering the
performance of DJGPP relative to other compilers, I assume that the
problem is not unique to gcc.)
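
For anyone who wants to look at this themselves, the sketch below is a
small stand-in, not the interrupt handler mentioned above.  The file name
reload.c and both function names are hypothetical.  It shows one possible
source of reloads: when a result is updated through a pointer, the
compiler may not be able to prove that the pointer and the input array are
distinct, so it may store and reload the value on every pass.

```c
/* reload.c (hypothetical file name).
   Generate assembly instead of an object file with:
       gcc -O2 -S reload.c
   and then read the resulting reload.s. */

/* Accumulates directly into *total.  Since `total` could point into
   `data`, the compiler may have to write and re-read *total each time
   around the loop. */
void sum_into(int *total, const int *data, int n)
{
    int i;
    *total = 0;
    for (i = 0; i < n; i++)
        *total += data[i];
}

/* Same result, but the running sum is a local variable, which the
   compiler is free to keep in a register until the final store. */
void sum_local(int *total, const int *data, int n)
{
    int i;
    int acc = 0;
    for (i = 0; i < n; i++)
        acc += data[i];
    *total = acc;
}
```

Comparing the loop bodies of the two functions in the .s file is a quick
way to see whether the compiler you are using emits the extra loads and
stores.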

Since I know nothing about compilers or teaching them how to optimize, I
have no idea how to fix this other than to hand-massage the asm output.
However, for library routines, like matrix multiply functions, I would
suggest that the effort is well worth it since you'll only need to do it
once.
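
As a starting point for that kind of tuning, here is a rough sketch of a
matrix multiply routine in C.  The name matmul and the flat row-major
layout are assumptions made for the example, not code from Netlib.  The
inner product is accumulated in a local variable and written to the output
once per element, which is a source-level step you can take before
resorting to editing the asm output by hand.

```c
/* c = a * b for n-by-n matrices stored row-major in flat arrays.
   The caller must make sure that c does not overlap a or b. */
void matmul(const double *a, const double *b, double *c, int n)
{
    int i, j, k;
    for (i = 0; i < n; i++) {
        for (j = 0; j < n; j++) {
            /* Running sum for element (i, j), stored once at the end. */
            double acc = 0.0;
            for (k = 0; k < n; k++)
                acc += a[i * n + k] * b[k * n + j];
            c[i * n + j] = acc;
        }
    }
}
```

For example, multiplying the 2x2 matrices {1, 2, 3, 4} and {5, 6, 7, 8}
fills c with {19, 22, 43, 50}.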

---------------
Peter Berdeklis
Dept. of Physics, Univ. of Toronto
