Mail Archives: cygwin/2007/06/28/15:20:16

X-Spam-Check-By: sourceware.org
MIME-Version: 1.0
Subject: RE: possible compiler optimization error
Date: Thu, 28 Jun 2007 15:19:48 -0400
Message-ID: <C6EEDB0EB45A56439F73B1D23E39694A35C841@USORL02P702.ww007.siemens.net>
In-Reply-To: <4684058B.DFC1CA09@dessent.net>
References: <C6EEDB0EB45A56439F73B1D23E39694A35C7EC AT USORL02P702 DOT ww007 DOT siemens DOT net> <4683F56D DOT 53B8E259 AT dessent DOT net> <C6EEDB0EB45A56439F73B1D23E39694A35C819 AT USORL02P702 DOT ww007 DOT siemens DOT net> <4684058B DOT DFC1CA09 AT dessent DOT net>
From: "Frederich, Eric P21322" <eric DOT frederich AT siemens DOT com>
To: <cygwin AT cygwin DOT com>
X-IsSubscribed: yes
Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm
List-Id: <cygwin.cygwin.com>
List-Unsubscribe: <mailto:cygwin-unsubscribe-archive-cygwin=delorie DOT com AT cygwin DOT com>
List-Subscribe: <mailto:cygwin-subscribe AT cygwin DOT com>
List-Archive: <http://sourceware.org/ml/cygwin/>
List-Post: <mailto:cygwin AT cygwin DOT com>
List-Help: <mailto:cygwin-help AT cygwin DOT com>, <http://sourceware.org/ml/#faqs>
Sender: cygwin-owner AT cygwin DOT com
Mail-Followup-To: cygwin AT cygwin DOT com
Delivered-To: mailing list cygwin AT cygwin DOT com
X-MIME-Autoconverted: from quoted-printable to 8bit by delorie.com id l5SJK496000414

> From: cygwin-owner AT cygwin DOT com On Behalf Of Brian Dessent
> Sent: Thursday, June 28, 2007 3:02 PM
> To: cygwin AT cygwin DOT com
> Subject: Re: possible compiler optimization error
> 
> I think Dave already explained it but in case it's not clear, on the
> i387, all floating point math happens at 80 bit registers, even if the
> underlying values are actually 32 bit (float) or 64 bit (double)
> quantities.  This means there can be extra bits of precision in the
> register if the value has not been written to memory yet.  -ffloat-store
> is kind of a hacky workaround to this problem that tells the compiler to
> try harder to write values to memory and read them back in whenever
> possible.  It's not a guaranteed fix, and it has a negative performance
> hit.
> 
> The real problem is not in the compiler, it's the crappy design of the
> i387.  The best workaround is not to use the 387 unit at all if
> possible.  This is what -mfpmath=sse does, as the sse unit was designed
> much more sanely so that it doesn't have this excess precision problem.
> 
> Note that sse only has support for 32 bit floating point types, you need
> sse2 for 64 bit double types.  And -march=i686 does not enable sse2
> because not all i686 class machines have sse2.  So that is why I said
> "if you have a sse2 machine and set -march appropriately", meaning e.g.
> -march=pentium4 or -march=k8.  That is why using "-march=i686" or
> "-march=i686 -msse" both fail, because neither imply sse2.
> 
> Using "-march=i686 -msse2" doesn't make a lot of sense to me, because it
> generates code that will cause invalid instruction faults on i686
> machines without sse2 (e.g.  ppro, celeron, pentium3, k7/athlon.)  By
> giving -msse2 you're already limiting the architecture to pentium4/k8
> anyway, so you might as well just use the correct -march.
> 
> This is all thankfully moot on x86_64, because there the 387 is
> obsoleted and essentially disabled entirely.
> 

This is all very good information.  Thank you all very much.
I was just reading http://gcc.gnu.org/bugzilla/show_bug.cgi?id=323,
which another post on this list linked to.
That bug report agrees with your description of -ffloat-store as a hacky
workaround: it says that -ffloat-store "may trigger instead of
suppressing the bug".
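
For anyone who wants to see the effect in isolation, here is a minimal
sketch of the kind of comparison that can go wrong.  The function names
are made up for illustration, and the volatile store is my own stand-in
for what -ffloat-store tries to do; it is not how the option is
implemented.

```c
/* Force a value through a 64-bit memory slot, discarding any extra
 * precision that a wider (e.g. 80-bit x87) register might still hold. */
static double store_as_double(double value)
{
    volatile double slot = value;
    return slot;
}

/* Hypothetical example: on an x87 build with optimization this may
 * return 0, because one quotient can be rounded to 64 bits while the
 * other is still held at 80 bits.  The result depends on the compiler,
 * the flags, and the target. */
int quotients_equal_naive(double a, double b)
{
    double q = a / b;
    return q == a / b;
}

/* Both quotients are rounded to double before they are compared, so
 * identical finite inputs compare equal regardless of the FP unit. */
int quotients_equal_stored(double a, double b)
{
    double q1 = store_as_double(a / b);
    double q2 = store_as_double(a / b);
    return q1 == q2;
}
```

Calling quotients_equal_stored(1.0, 3.0) should give 1 everywhere; what
quotients_equal_naive(1.0, 3.0) gives is exactly the part that varies
between builds.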

I used -march=i686 because I couldn't find a list of all accepted
values in the gcc man page.  After some googling I found that I can
use -march=pentium-m for my Dell D600 laptop.  I am now happy to report
that setting -march=pentium-m -O2 works fine.  I am glad to hear that
using SSE2 correctly solves the problem without having to use
-ffloat-store and take a possible performance hit.
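
A quick way to check what a given set of flags actually selected is to
ask the compiler itself.  The sketch below assumes GCC, which as far as
I know predefines __SSE2_MATH__ when SSE2 is used for floating point
math, and a C99 <float.h> that provides FLT_EVAL_METHOD.  The function
names are hypothetical.

```c
#include <float.h>   /* FLT_EVAL_METHOD (C99) */

/* FLT_EVAL_METHOD: 0 = operations are evaluated in their declared type,
 * 1 = float and double are evaluated as double,
 * 2 = everything is evaluated as long double (typical for x87 math),
 * -1 = indeterminate. */
int fp_eval_method(void)
{
    return FLT_EVAL_METHOD;
}

/* Returns 1 if this translation unit was compiled with SSE2 doing the
 * floating point math, e.g. (assumed command line):
 *   gcc -O2 -march=pentium-m -mfpmath=sse check.c
 * and 0 otherwise. */
int built_with_sse2_math(void)
{
#if defined(__SSE2_MATH__)
    return 1;
#else
    return 0;
#endif
}
```

If fp_eval_method() reports 2, intermediate results are being kept at
extended precision and the excess precision problem can still show up.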

I should also mention that the Solaris machine I was using is a SPARC
and the Linux machine I was using is an Opteron.
It would be interesting to load Solaris x86 or Linux on the same Windows
laptop just to confirm that the hardware is responsible.
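
Short of installing another OS, the hardware side can at least be
queried at run time.  This is only a sketch: it assumes a reasonably
recent GCC (or a compatible compiler) on x86 that provides the
__builtin_cpu_supports builtin, and cpu_has_sse2 is a name I made up.

```c
/* Returns 1 if the running CPU reports SSE2, 0 if it does not, and -1
 * if this build has no way to ask (non-x86 target or a compiler
 * without the GCC CPU builtins). */
int cpu_has_sse2(void)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __builtin_cpu_init();                       /* set up CPU feature data */
    return __builtin_cpu_supports("sse2") != 0; /* normalize to 0 or 1 */
#else
    return -1;
#endif
}
```

A result of 0 on an i686 class machine would explain the invalid
instruction faults described above when code is built with -msse2.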
