From: Brian Dessent
Date: Thu, 28 Jun 2007 12:01:31 -0700
To: cygwin AT cygwin DOT com
Subject: Re: possible compiler optimization error

"Frederich, Eric P21322" wrote:

> I do realize that they may in fact differ way out there beyond 15
> decimal places.
> What I don't understand is how two numbers pass a ==, then fail a >=,
> then pass a >= unless (after compiler optimizations) the second and
> third comparisons are actually comparing copies of these numbers which
> aren't "bit-exact" copies.
> Is this what you're saying might be happening and what -ffloat-store is
> supposed to resolve?
> If so, that makes sense and I can accept that.

I think Dave already explained it, but in case it's not clear: on the
i387, all floating point math happens in 80 bit registers, even if the
underlying values are actually 32 bit (float) or 64 bit (double)
quantities. This means there can be extra bits of precision in the
register if the value has not been written to memory yet. A copy that
has been stored to memory and reloaded has been rounded to 64 bits,
while a copy still sitting in a register has not, so the two can compare
unequal.

-ffloat-store is kind of a hacky workaround to this problem that tells
the compiler to try harder to write values to memory and read them back
in whenever possible. It's not a guaranteed fix, and it carries a
performance hit.

The real problem is not in the compiler, it's the crappy design of the
i387. The best workaround is not to use the 387 unit at all if possible.
This is what -mfpmath=sse does, as the sse unit was designed much more
sanely so that it doesn't have this excess precision problem.

Note that sse only supports 32 bit floating point types; you need sse2
for 64 bit double types. And -march=i686 does not enable sse2, because
not all i686 class machines have sse2. That is why I said "if you have a
sse2 machine and set -march appropriately", meaning e.g. -march=pentium4
or -march=k8. It is also why "-march=i686" and "-march=i686 -msse" both
fail: neither implies sse2.

Using "-march=i686 -msse2" doesn't make a lot of sense to me, because it
generates code that will cause invalid instruction faults on i686
machines without sse2 (e.g. ppro, celeron, pentium3, k7/athlon.) By
giving -msse2 you're already limiting the architecture to pentium4/k8
anyway, so you might as well just use the correct -march.

This is all thankfully moot on x86_64, because sse2 is guaranteed to be
present there and the compiler uses it for float and double math by
default, so the 387 is essentially out of the picture.

Brian