Mail Archives: cygwin/2007/06/28/15:20:16
> From: cygwin-owner AT cygwin DOT com On Behalf Of Brian Dessent
> Sent: Thursday, June 28, 2007 3:02 PM
> To: cygwin AT cygwin DOT com
> Subject: Re: possible compiler optimization error
>
> I think Dave already explained it but in case it's not clear, on the
> i387, all floating point math happens in 80-bit registers, even if the
> underlying values are actually 32-bit (float) or 64-bit (double)
> quantities. This means there can be extra bits of precision in the
> register if the value has not been written to memory yet.
> -ffloat-store is kind of a hacky workaround to this problem that tells
> the compiler to try harder to write values to memory and read them
> back in whenever possible. It's not a guaranteed fix, and it carries
> a performance hit.
>
> The real problem is not in the compiler, it's the crappy design of the
> i387. The best workaround is not to use the 387 unit at all if
> possible. This is what -mfpmath=sse does, as the sse unit was designed
> much more sanely so that it doesn't have this excess precision problem.
>
> Note that sse only has support for 32-bit floating point types; you
> need sse2 for 64-bit double types. And -march=i686 does not enable
> sse2 because not all i686 class machines have sse2. So that is why I
> said "if you have a sse2 machine and set -march appropriately",
> meaning e.g. -march=pentium4 or -march=k8. That is why using
> "-march=i686" or "-march=i686 -msse" both fail, because neither
> implies sse2.
>
> Using "-march=i686 -msse2" doesn't make a lot of sense to me, because
> it generates code that will cause invalid instruction faults on i686
> machines without sse2 (e.g. ppro, celeron, pentium3, k7/athlon.) By
> giving -msse2 you're already limiting the architecture to pentium4/k8
> anyway, so you might as well just use the correct -march.
>
> This is all thankfully moot on x86_64, because there the 387 is
> obsoleted and essentially disabled entirely.
>
This is all very good information. Thank you all very much.
I was just reading http://gcc.gnu.org/bugzilla/show_bug.cgi?id=323
linked to by another posting on here.
Much like you say that -ffloat-store is a hacky workaround, on that bug
report it is said that -ffloat-store "may trigger instead of suppressing
the bug".
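For anyone who wants to narrow the effect down to a single expression
rather than the whole translation unit, here is a minimal sketch in C.
It assumes that storing through a volatile double makes the compiler
round the value to 64 bits before it is used again; the helper names
round_to_double and quotient_is_stable are made up for illustration
and are not from the original test case.

```c
#include <stdio.h>

/* Hypothetical helper: push a value through memory so that any excess
   register precision (e.g. 80-bit on the 387) is rounded away. */
static double round_to_double(double x)
{
    volatile double stored = x;  /* volatile forces a real store and reload */
    return stored;
}

/* Hypothetical check: compute a / b twice and compare the results only
   after both have been rounded to a 64-bit double. Returns 1 if equal. */
int quotient_is_stable(double a, double b)
{
    double first = round_to_double(a / b);
    double second = round_to_double(a / b);
    return first == second;
}

/* Print the outcome for one pair of operands. */
void report_quotient(double a, double b)
{
    printf("%g / %g compares equal after rounding: %s\n",
           a, b, quotient_is_stable(a, b) ? "yes" : "no");
}
```

Calling report_quotient(1.0, 3.0) from main should report "yes"
regardless of which floating point unit is in use, because both sides
of the comparison have been rounded the same way. This only removes
the excess bits at the points where the helper is called; the
intermediate arithmetic may still be carried out at higher precision.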
I used -march=i686 because I couldn't find a list of all accepted
values in the man page for gcc. After some googling I found that I can
use -march=pentium-m for my Dell D600 laptop. I am now happy to report
that setting -march=pentium-m -O2 works fine. I am glad to hear that
using SSE2 correctly solves the problem without having to use
-ffloat-store and take a possible performance hit.
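A quick way to check what a given set of flags actually selected is to
ask the compiler itself. The sketch below assumes GCC's predefined
macros __SSE_MATH__ and __SSE2_MATH__, which are set when floating
point math is done in the SSE unit, plus the standard C99
FLT_EVAL_METHOD macro from <float.h>. The function names are
hypothetical.

```c
#include <float.h>
#include <stdio.h>

/* Hypothetical helper: describe which unit this translation unit was
   compiled to use for floating point math, based on GCC's macros. */
const char *fp_unit_description(void)
{
#if defined(__SSE2_MATH__)
    return "SSE2 (float and double math in SSE registers)";
#elif defined(__SSE_MATH__)
    return "SSE for float only; double math still uses the 387";
#else
    return "387, or a non-x86 target";
#endif
}

/* Print the selected unit and the C99 evaluation method:
   0 = evaluate in the declared type, 2 = evaluate in long double
   (the usual value when the 387 is in use), -1 = indeterminable. */
void print_fp_report(void)
{
    printf("floating point unit: %s\n", fp_unit_description());
    printf("FLT_EVAL_METHOD:     %d\n", (int)FLT_EVAL_METHOD);
}
```

Building the same file with different -march and -mfpmath settings and
calling print_fp_report from main should show whether double math has
actually moved off the 387.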
I should also mention that the Solaris machine I was using is a SPARC
and the Linux machine I was using is an Opteron.
It would be interesting to load Solaris x86 or Linux on the same Windows
laptop just to confirm that the hardware is the cause.