delorie.com/archives/browse.cgi   search  
Mail Archives: pgcc/1998/03/12/17:22:20

X-pop3-spooler: POP3MAIL 2.1.0 b 3 961213 -bs-
Delivered-To: pcg AT goof DOT com
Message-ID: <35081A31.16F7D911@pfh.research.philips.com>
Date: Thu, 12 Mar 1998 18:24:01 +0100
From: Thomas Koehler <T DOT Koehler AT pfh DOT research DOT philips DOT com>
Organization: Philips
X-Mailer: Mozilla 3.01 (X11; I; Linux 2.0.32 i686)
MIME-Version: 1.0
To: Marc Lehmann <pcg AT goof DOT com>
Subject: Re: paranoia & extra precision [was -fno-float-store in pgcc]
References: <199803111756 DOT AA192209001 AT typhoon DOT rose DOT hp DOT com> <19980312074656 DOT 14562 AT cerebro DOT laendle>
Sender: Marc Lehmann <pcg AT goof DOT com>
Status: RO
X-Status: A
Lines: 110

Marc Lehmann wrote:
> 
> [cc sent to beastium-list]
> 
> The recent postings about paranoia and fpu result mismatches with
> optimized/unoptimized compilations need some clarifications, I hope to
> provide it now ;)
> 
> > Marc -
> >
> > What do you make of the following code.   PGCC produces different
> > results when optimizing then when not optimization.  I was
> > told it has to do with -fno-float-store, but pgcc doesn't appear
> 
> the x86 chips are not really ieee compliant. that's not too serious, as I'll
> explain:
> 
> >      y = 1.0;
> >      x = 1.0 + y;
> >      oldx = 1.0;
> >      do
> >      {
> >          y /= 2.0;
> >          oldx = x;
> >          x = 1.0 + y;
> >      } while ( x < oldx );
> 
> gcc options             value of y
> -O0                     5.551115e-17
> -O                      2.710505e-20
> -O -ffloat-store        5.551115e-17
> 
> let's analyze that loop:
> 
> when will it stop? it will stop when x !< x + y, where y is 1.0, 0.5, 0.25
> etc... (continously halved). "double"'s have 53 bit's mantissa (only 52 are
> stored). 2^53 ~ 10^15, i.e. we will have roughly 16 digits precision.
> 
> This is why the loop stops at y = 5*10^-17. This is compliant to the ieee
> rule that intermediate values have to be at the same precision as the
> result.
> 
> Now, when optimizing, gcc tries to keep values in the floating point
> registers. But other than, say, the m68k fpu, the x86 fpu has no notion of
> different data types, all fpu registers have the same type, long double (80
> bits extended format). 80 bits extended format => 64 bits mantissa => ~19
> digits precision.
> 
> This is why, in the second example, you get the 2*10^-20. This is fast, more
> accurate than needed, but not IEEE compliant.
> 
> When you specify -ffloat-store, you force gcc to store _every_ _intermediate_
> _value_ in memory, as to truncate the value to appropriate precision (just
> like in an unoptimized program). This is why we, again, get the correct
> answer with "-O -ffloat-store".
> 
> As all of you correctly guessed, this is helplessly slow, but the only way
> to force correct behaviour. 

There seems to be annother way, but involes the unducumented functions
__getfpucw and __setfpucw. By them the fpu-behaviour can be changed,
including internal rounding an SIGFPE generation. See
/usr/include/i386/fpu_control.h for a short information. The following
code works at least for gcc272 an may work for pgcc as well:

#include <stdio.h>
#include <i386/fpu_control.h>
extern unsigned short __getfpucw ( );


void main ( )
{	double y, x, oldx;
	unsigned short	u;

	u = __getfpucw ();
	u &= 0xFEFF;
	(void) __setfpucw ( u );

	
	y = 1.0;
	x = 1.0 + y;
	oldx = 1.0;
	do
	{
	    y /= 2.0;
	    oldx = x;
	    x = 1.0 + y;
	} while ( x < oldx );
	
	printf ("%g\n", y);
}

btw. in constrast to the info in fpu_control.h, floating point
exceptions are disabled by default.


Thomas
-- 



Thomas Koehler
Philips Research Laboratories
Division Technical Systems
Roentgenstrasse 24-26
D-22335 Hamburg
Germany

phone:  +40/5078-2103
e-mail: T DOT Koehler AT PFH DOT Research DOT Philips DOT com

- Raw text -


  webmaster     delorie software   privacy  
  Copyright © 2019   by DJ Delorie     Updated Jul 2019