Mail Archives: cygwin/2006/03/21/22:18:39

X-Spam-Check-By: sourceware.org
Message-Id: <7.0.1.0.0.20060321190327.01d5def0@weasel.com>
Date: Tue, 21 Mar 2006 19:10:58 -0800
To: tprince AT computer DOT org, cygwin AT cygwin DOT com
From: jdeifik <jdeifik AT weasel DOT com>
Subject: Re: pthreads don't scale on windows xp, but does scale on linux, cygwin 1.5.19
In-Reply-To: <4420BDB9.1040105@myrealbox.com>
References: <25082fe70603210232uc7e017ft8848c336a649c7dc AT mail DOT gmail DOT com> <44200AD3 DOT 7000303 AT byu DOT net> <25082fe70603210714h23ec44d8v7f1cc9c0f2d4ad30 AT mail DOT gmail DOT com> <7 DOT 0 DOT 1 DOT 0 DOT 0 DOT 20060321073317 DOT 01dda4f8 AT weasel DOT com> <4420BDB9 DOT 1040105 AT myrealbox DOT com>
Mime-Version: 1.0
X-IsSubscribed: yes
Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm
List-Unsubscribe: <mailto:cygwin-unsubscribe-archive-cygwin=delorie DOT com AT cygwin DOT com>
List-Subscribe: <mailto:cygwin-subscribe AT cygwin DOT com>
List-Archive: <http://sourceware.org/ml/cygwin/>
List-Post: <mailto:cygwin AT cygwin DOT com>
List-Help: <mailto:cygwin-help AT cygwin DOT com>, <http://sourceware.org/ml/#faqs>
Sender: cygwin-owner AT cygwin DOT com
Mail-Followup-To: cygwin AT cygwin DOT com
Delivered-To: mailing list cygwin AT cygwin DOT com

At 07:00 PM 3/21/2006, you wrote:

>jdeifik wrote:
>>I have a dual Xeon 2.4 GHz machine with hyperthreading enabled.
>>This gives me 4 logical processors.
>>The machine dual boots to windows xp sp2, and linux.
>>I wrote a highly parallelizable program and tested it with 1 to 8 threads,
>>with no source changes between windows and linux.
>>Here is the performance on linux using gcc-3.4.3
>>threads
>>1    436.41user 0.10system 7:16.37elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
>>2    436.00user 0.02system 3:38.15elapsed 199%CPU (0avgtext+0avgdata 0maxresident)k
>>3    369.15user 0.05system 2:03.48elapsed 298%CPU (0avgtext+0avgdata 0maxresident)k
>>4    359.77user 0.08system 1:42.95elapsed 349%CPU (0avgtext+0avgdata 0maxresident)k
>>6    357.83user 0.09system 1:40.94elapsed 354%CPU (0avgtext+0avgdata 0maxresident)k
>>8    358.79user 0.06system 1:41.80elapsed 352%CPU (0avgtext+0avgdata 0maxresident)k
>>To compute efficiency, take the single-thread elapsed time /
>>(# threads * threaded elapsed time).
>>There is virtually perfect scaling. 4 processors scale with an
>>efficiency of about 103%.
>>For 6 and 8 threads, efficiency goes up a small amount.
>>
>>Here is the performance on windows xp using cygwin pthreads and gcc-3.4.4
>>1    434.60user 0.20system 7:16.47elapsed 99%CPU (0avgtext+0avgdata 509696maxresident)k
>>2    441.78user 0.24system 3:42.06elapsed 199%CPU (0avgtext+0avgdata 510208maxresident)k
>>3    579.68user 0.15system 3:14.50elapsed 298%CPU (0avgtext+0avgdata 511232maxresident)k
>>4    675.39user 0.15system 2:51.50elapsed 393%CPU (0avgtext+0avgdata 512000maxresident)k
>>6    711.70user 0.18system 3:01.20elapsed 392%CPU (0avgtext+0avgdata 511488maxresident)k
>>8    683.35user 0.21system 2:56.05elapsed 388%CPU (0avgtext+0avgdata 512000maxresident)k
>>Things are fine for 2 threads, scaling with an efficiency of 96%
>>For 3 threads, scaling efficiency is 73%
>>For 4 threads, scaling efficiency is 62%
>>For 6 threads, scaling efficiency is 39%
>>For 8 threads, scaling efficiency is 30%
>
>Windows doesn't have HT aware scheduling, such as recent linux 
>schedulers incorporate.  Cygwin doesn't attempt to improve on the 
>Windows scheduler.  I won't ask for relevant details about your 
>linux, or how you managed to write a program which doesn't deliver 
>close to full performance at 2 threads, as that would take this even 
>further Off Topic. However, if you are getting good scaling to 2 
>threads, that should enable you to get all the dual processor 
>performance you can expect in Windows for practical purposes.  You 
>might try repeating your tests with HT disabled in BIOS.

My linux is Mandrake 10.2, running (I suspect) kernel 2.6.11-13smp.
My program scales perfectly at 2 threads on linux. It also scales 
perfectly at 4 threads on linux.
The problem isn't with my program.

I am not sure why an HT-aware scheduler matters for Windows when
there are 4 or more threads. With 2 threads, I can see that you would
want one thread per physical processor.
With 4 or more threads, cygwin pthreads really sucks: 4->62%, 6->39%,
8->30% efficiency.
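
For anyone who wants to recompute the efficiency figures, here is a
minimal sketch of the formula quoted above (single-thread elapsed time /
(# threads * threaded elapsed time)). It is written in Python purely for
illustration. The function names and the WINDOWS_ELAPSED table are
assumptions of this sketch, not part of the test program. The elapsed
times are copied from the Windows XP runs above, and the results may
differ by a point or two from the rounded percentages given in the
message.

```python
def to_seconds(elapsed):
    """Convert a time(1)-style 'M:SS.ss' elapsed string to seconds."""
    minutes, seconds = elapsed.split(":")
    return int(minutes) * 60 + float(seconds)


def efficiency(single_elapsed, threaded_elapsed, n_threads):
    """Single-thread elapsed time / (# threads * threaded elapsed time)."""
    return to_seconds(single_elapsed) / (n_threads * to_seconds(threaded_elapsed))


# Elapsed times from the Windows XP / cygwin pthreads runs quoted above,
# keyed by thread count (hypothetical table name, for illustration only).
WINDOWS_ELAPSED = {
    1: "7:16.47",
    2: "3:42.06",
    3: "3:14.50",
    4: "2:51.50",
    6: "3:01.20",
    8: "2:56.05",
}

if __name__ == "__main__":
    baseline = WINDOWS_ELAPSED[1]
    for n_threads, elapsed in sorted(WINDOWS_ELAPSED.items()):
        print(f"{n_threads} threads: {efficiency(baseline, elapsed, n_threads):.1%}")
```

The same functions work for the linux table by substituting its elapsed
times.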

It seems to me that more and more apps are turning to threading for
performance, and more and more hardware is available with
multi-processor, multi-core, and multi-threading.

         Jeff Deifik 


