Mail Archives: cygwin/2007/06/28/16:36:34

Message-ID: <46841BBB.33BC6354@dessent.net>
Date: Thu, 28 Jun 2007 13:36:11 -0700
From: Brian Dessent <brian AT dessent DOT net>
To: cygwin AT cygwin DOT com
Subject: Re: possible compiler optimization error
References: <BAY108-F1181298FD58A847ABB9088BE090 AT phx DOT gbl> <46840F0E DOT 9EA8612B AT dessent DOT net> <C6EEDB0EB45A56439F73B1D23E39694A35C856 AT USORL02P702 DOT ww007 DOT siemens DOT net>

"Frederich, Eric P21322" wrote:

> You said that combining -march=i686 and -msse2 didn't make too much
> sense.

What I meant is that by specifying -msse2 you are setting the bar a lot
higher than -march=i686: the generated code won't run on a number of
i686 machines, so you might as well use a more specific -march that
includes sse2 anyway.

> So without setting -march, what all should I be setting?
> On my laptop with CPU-Z I see MMX, SSE, and SSE2.
> On my Opteron Linux box I obviously see a lot more when I cat
> /proc/cpuinfo.

If you're compiling code for yourself then just use the appropriate arch
for each machine, -march=pentium-m and -march=opteron respectively.
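On the Linux box, the feature list can also be pulled out of
/proc/cpuinfo programmatically. The following is only a minimal sketch:
it assumes the x86 Linux layout, where the features sit on a line whose
key is "flags", and the sample text is a made-up excerpt rather than
output from either machine in this thread. The function name cpu_flags
is hypothetical.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of feature flags from /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # x86 Linux lists features on a line keyed "flags" (assumption).
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


# Hypothetical excerpt; a real file lists many more flags.
sample = "model name\t: Some CPU\nflags\t\t: fpu mmx sse sse2 syscall\n"

# Keep only the SIMD features discussed in this thread.
simd = sorted(cpu_flags(sample) & {"mmx", "sse", "sse2"})
print(simd)
```

On a real machine the input would come from
open("/proc/cpuinfo").read() instead of the sample string.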

If you're going to distribute binaries to others then I guess it gets a
little more complicated.  If you're comfortable requiring sse2 then I
suppose -march=i686 -msse2 is reasonable.  You might also test
-march=i686 -mtune=pentium4 -msse2.  What this means is choose the
instruction set of generic i686 plus sse2, but choose the scheduler for
p4.  Due to its huge pipeline the p4 is more sensitive to scheduling
than the other sse2-class machines like k8, so in theory this means a
small performance win on p4 machines without much (if any) cost on
k8/core2/whatever.  But I might be wrong here, so if performance is of
any concern you should test it.
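To compare the two options, one approach is to build the same benchmark
once per flag set and time each binary on the machines you care about.
Here is a rough sketch that only assembles the two gcc command lines.
The file names, the -O2 level, and the helper name candidate_builds are
assumptions for illustration, not something from this thread.

```python
def candidate_builds(source, output_prefix):
    """Return {label: gcc argv} for the two distribution options above."""
    variants = {
        # Generic i686 instruction set plus sse2, default scheduling.
        "i686-sse2": ["-march=i686", "-msse2"],
        # Same instruction set, but scheduled for the p4 pipeline.
        "i686-sse2-tune-p4": ["-march=i686", "-mtune=pentium4", "-msse2"],
    }
    return {
        # -O2 is an assumed optimization level, not taken from the thread.
        label: ["gcc", "-O2"] + flags + ["-o", f"{output_prefix}-{label}", source]
        for label, flags in variants.items()
    }


# "bench.c" and "bench" are hypothetical names.
for label, argv in candidate_builds("bench.c", "bench").items():
    print(label, "->", " ".join(argv))
```

Each argv list can be handed to subprocess.run as is, and the resulting
binaries timed separately on p4 and non-p4 hardware.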

> If I just use what is common between them, -mmmx, -msse, and -msse2 I
> should be free of floating point errors and hopefully get some
> performance increase.  Should I be using -mmmx if I'm also using -msse
> and -msse2?

Well, first of all, yes, any machine capable of sse2 will also include
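sse and mmx, so it's redundant to specify all three.  But sse and mmx
aren't really relevant at all if you're using 'double' types: mmx is
integer-only and sse only handles single-precision floats, so sse2 is
the one that matters for doubles.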
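The redundancy can be written down as a small implication table. This
is just an illustrative sketch of the sse2 -> sse -> mmx relationship
described above. The table and the helper name drop_implied are
hypothetical and not part of gcc itself.

```python
# Each flag maps to the flags it already covers, per the reply above.
IMPLIES = {
    "-msse2": {"-msse", "-mmmx"},
    "-msse": {"-mmmx"},
}


def drop_implied(flags):
    """Remove flags that another flag in the list already implies."""
    implied = set()
    for flag in flags:
        implied |= IMPLIES.get(flag, set())
    # Preserve the original order of the flags that remain.
    return [flag for flag in flags if flag not in implied]


print(drop_implied(["-mmmx", "-msse", "-msse2"]))  # ['-msse2']
```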

Brian
