Mail Archives: pgcc/1998/03/12/16:52:25
[cc sent to beastium-list]
The recent postings about paranoia and fpu result mismatches with
optimized/unoptimized compilations need some clarifications, I hope to
provide it now ;)
> Marc -
>
> What do you make of the following code. PGCC produces different
> results when optimizing then when not optimization. I was
> told it has to do with -fno-float-store, but pgcc doesn't appear
the x86 chips are not really ieee compliant. that's not too serious, as I'll
explain:
> y = 1.0;
> x = 1.0 + y;
> oldx = 1.0;
> do
> {
> y /= 2.0;
> oldx = x;
> x = 1.0 + y;
> } while ( x < oldx );
gcc options value of y
-O0 5.551115e-17
-O 2.710505e-20
-O -ffloat-store 5.551115e-17
let's analyze that loop:
when will it stop? it will stop when x !< x + y, where y is 1.0, 0.5, 0.25
etc... (continously halved). "double"'s have 53 bit's mantissa (only 52 are
stored). 2^53 ~ 10^15, i.e. we will have roughly 16 digits precision.
This is why the loop stops at y = 5*10^-17. This is compliant to the ieee
rule that intermediate values have to be at the same precision as the
result.
Now, when optimizing, gcc tries to keep values in the floating point
registers. But other than, say, the m68k fpu, the x86 fpu has no notion of
different data types, all fpu registers have the same type, long double (80
bits extended format). 80 bits extended format => 64 bits mantissa => ~19
digits precision.
This is why, in the second example, you get the 2*10^-20. This is fast, more
accurate than needed, but not IEEE compliant.
When you specify -ffloat-store, you force gcc to store _every_ _intermediate_
_value_ in memory, as to truncate the value to appropriate precision (just
like in an unoptimized program). This is why we, again, get the correct
answer with "-O -ffloat-store".
As all of you correctly guessed, this is helplessly slow, but the only way
to force correct behaviour. In most cases, this is of no concern, since
programs generally appreciate the extra precision they get, this is only of
concern to programs that specifically need the exact (less precise)
representation, like paranoia, which get's really confused when "double"
seems to have different precision depending on the surrounding code.
Any questions left? Don't hesitate - Ask!
-----==- |
----==-- _ |
---==---(_)__ __ ____ __ Marc Lehmann +--
--==---/ / _ \/ // /\ \/ / pcg AT goof DOT com |e|
-=====/_/_//_/\_,_/ /_/\_\ --+
The choice of a GNU generation |
|
- Raw text -