Message-ID: <46841BBB.33BC6354@dessent.net>
Date: Thu, 28 Jun 2007 13:36:11 -0700
From: Brian Dessent
To: cygwin AT cygwin DOT com
Subject: Re: possible compiler optimization error

"Frederich, Eric P21322" wrote:

> You said that combining -march=i686 and -msse2 didn't make too much
> sense.

What I meant by that is that by specifying -msse2 you are setting the bar a lot higher than -march=i686, generating code that won't run on a number of i686 machines, so you might as well use a more specific -march that includes sse2 anyway.

> So without setting -march, what all should I be setting?
> On my laptop with CPU-Z I see MMX, SSE, and SSE2.
> On my Opteron Linux box I obviously see a lot more when I cat
> /proc/cpuinfo.

If you're compiling code for yourself then just use the appropriate arch for each machine, -march=pentium-m and -march=opteron respectively.

If you're going to distribute binaries to others then I guess it gets a little more complicated. If you're comfortable requiring sse2 then I suppose -march=i686 -msse2 is reasonable. You might also test -march=i686 -mtune=pentium4 -msse2. What this means is: use the instruction set of generic i686 plus sse2, but schedule instructions for the p4. Due to its huge pipeline the p4 is more sensitive to scheduling than the other sse2-class machines like k8, so in theory this means a small performance win on p4 machines without much (if any) cost on k8/core2/whatever.
But I might be wrong here, so if performance is of any concern you should test it.

> If I just use what is common between them, -mmmx, -msse, and -msse2 I
> should be free of floating point errors and hopefully get some
> performance increase. Should I be using -mmmx if I'm also using -msse
> and -msse2?

Well, first of all, yes, any machine capable of sse2 will also include sse and mmx, so it's redundant to specify all three. But sse and mmx aren't really relevant at all if you're using 'double' types.

Brian
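
One rough way to check which of these instruction sets a given flag combination actually enables is to look at the feature macros that GCC predefines (__MMX__, __SSE__, __SSE2__). The sketch below is only an illustration and not something from this thread: the struct simd_flags type and the detect_simd_flags and format_simd_flags helpers are hypothetical names made up for the example.

```c
#include <stdio.h>

/* Hypothetical helper type (not from the thread): one field per SIMD
   feature macro the compiler may predefine for the flags in use. */
struct simd_flags {
    int mmx;
    int sse;
    int sse2;
};

/* Record which feature macros are defined at compile time.  This reflects
   the -march/-m flags the file was built with, not the CPU it runs on. */
static struct simd_flags detect_simd_flags(void)
{
    struct simd_flags f = {0, 0, 0};
#ifdef __MMX__
    f.mmx = 1;
#endif
#ifdef __SSE__
    f.sse = 1;
#endif
#ifdef __SSE2__
    f.sse2 = 1;
#endif
    return f;
}

/* Write a one-line summary into buf; returns what snprintf returns. */
static int format_simd_flags(struct simd_flags f, char *buf, size_t len)
{
    return snprintf(buf, len, "MMX: %d  SSE: %d  SSE2: %d",
                    f.mmx, f.sse, f.sse2);
}
```

To try it, call detect_simd_flags() and format_simd_flags() from a main() and print the buffer, then build the file once with -march=i686 and once with -march=i686 -msse2 on a 32-bit x86 target and compare the two lines. On x86-64 targets SSE2 is part of the baseline, so the comparison is only interesting for 32-bit builds.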