Mail Archives: cygwin/2023/09/02/15:59:23
Message-ID: <07386659-68b3-a35d-1402-22684f8e5755@Shaw.ca>
Date: Sat, 2 Sep 2023 13:59:01 -0600
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101 Thunderbird/102.15.0
Subject: Re: posix thread scaling issue
To: cygwin AT cygwin DOT com
References: <550e8950-8f7a-4765-b23e-57d0e710fde0 AT jeffunit DOT com>
 <2cfbcf8d-911f-a64b-8916-12b005c9f6f6 AT Shaw DOT ca>
 <cf618819-c30c-439d-ad5f-54b2311bd936 AT jeffunit DOT com>
Organization: Inglis
In-Reply-To: <cf618819-c30c-439d-ad5f-54b2311bd936@jeffunit.com>
From: Brian Inglis via Cygwin <cygwin AT cygwin DOT com>
Reply-To: cygwin AT cygwin DOT com
Cc: Brian Inglis <Brian DOT Inglis AT Shaw DOT ca>
On 2023-09-02 12:27, jeff via Cygwin wrote:
> On 9/2/2023 10:56, Brian Inglis wrote:
>> On 2023-09-02 08:57, jeff via Cygwin wrote:
>>> I have a program that is embarrassingly parallel.
>>> On my older computer which has an epyc 7302 (16 cores, 32 threads) it scales
>>> very well using cygwin, and fully utilizes all threads.
>>> On my new computer which has an epyc 7B13 (64 cores, 128 threads) it does not
>>> scale very well.
>>> According to the windows task manager, it only uses 74% of the cpu resources.
>>> The time it takes the program to run on windows is 166 seconds.
>>> Using the same hardware on a recent version of linux, I can get 100% cpu
>>> utilization and the program takes 100 seconds to run.
>>> I suspect there may be something in cygwin that doesn't scale well with lots
>>> of posix threads.
Both Windows and Cygwin support multiple processor groups: some developers,
maintainers, and users run such systems, so process and thread support for
them has been added to Cygwin.
>>> I know this is a bit of an unusual situation, but you can buy a 128 core /
>>> 256 thread system now.
>>> Enclosed is the output of cygcheck.
>>> I updated my version of cygwin to be current as of today, Sep 2 2023.
>> What Windows edition and version are you running?
>> For details run:
>>
>> $ reg query "HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion" \
>> | sed '/^\s\+\.*\s/!d;/^.\{80,\}/d'
>>
>> Some retail editions limit you to 64 threads and that seems to be your case:
>>
>>     NUMBER_OF_PROCESSORS = '64'
>>
>> To make full use of your processors, you may have to upgrade your Windows to a
>> commercial licence (and installation) of Windows 10/11 Pro for Workstations,
>> enabling server features on non-server "Workstations" ~ HEDTs (High-End
>> DeskTops); see:
>>
>> https://www.anandtech.com/show/15483/amd-threadripper-3990x-review/3
>>
>> or just run Linux!
>>
>> Watch out for terms misused like processor == socket on some sites!
>>
>> Also, you have to consider these are server systems, mainly designed for VM
>> not HPC (High Performance Computing) parallelism.
>>
>> Your older system has higher base and boost/turbo clocks 3.0-3.3GHz: your
>> newer system has lower clocks 2.25-2.65/3/3.5GHz which seems to depend on OEM
>> target.
>>
>> You may also need to upgrade your memory, as each core could run ~10GB/s
>> instructions, and these workstations are often provisioned with 128-256GB
>> (2-4GB/core), so that may also need a Windows edition upgrade.
> I am running windows 10 professional. Using the task manager, 64 cores and 128
> threads shows up for my processor.
As the linked AnandTech article shows and explains using the Task Manager
Performance tab, Win 10 Pro may think you have dual sockets, which limits the
maximum thread parallelism you can achieve:
"Now the thing is, Workstation and Enterprise are built with multiple processor
groups in mind, whereas Pro is not."
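For a rough sense of the numbers: Windows caps a processor group at 64 logical
processors, so a 128-thread system cannot fit in one group. The sketch below
only does that ceiling division; `groups_needed` and `GROUP_SIZE` are
hypothetical names chosen for illustration. The result is a lower bound, since
Windows may split groups differently (for example along NUMA nodes).

```shell
#!/bin/sh
# Minimal sketch: the smallest number of Windows processor groups that can
# hold a given count of logical processors.
# GROUP_SIZE is the Windows limit of 64 logical processors per group.
GROUP_SIZE=64

# groups_needed N -> ceiling of N / GROUP_SIZE (hypothetical helper name).
groups_needed() {
    echo $(( ($1 + GROUP_SIZE - 1) / GROUP_SIZE ))
}

for n in 32 64 128 256; do
    echo "$n logical processors -> at least $(groups_needed "$n") group(s)"
done
```

On these figures the 32-thread EPYC 7302 fits in a single group, while the
128-thread EPYC 7B13 needs at least two, which would be consistent with the
older system scaling well and the newer one not.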
> Here is the output of your regex:
>     SystemRoot    REG_SZ    C:\Windows
>     BaseBuildRevisionNumber    REG_DWORD    0x1
>     BuildBranch    REG_SZ    vb_release
>     BuildGUID    REG_SZ    ffffffff-ffff-ffff-ffff-ffffffffffff
>     BuildLab    REG_SZ    19041.vb_release.191206-1406
>     BuildLabEx    REG_SZ    19041.1.amd64fre.vb_release.191206-1406
>     CompositionEditionID    REG_SZ    Enterprise
>     CurrentBuild    REG_SZ    19045
>     CurrentBuildNumber    REG_SZ    19045
>     CurrentMajorVersionNumber    REG_DWORD    0xa
>     CurrentMinorVersionNumber    REG_DWORD    0x0
>     CurrentType    REG_SZ    Multiprocessor Free
>     CurrentVersion    REG_SZ    6.3
>     EditionID    REG_SZ    Professional
>     EditionSubManufacturer    REG_SZ
>     EditionSubstring    REG_SZ
>     EditionSubVersion    REG_SZ
>     InstallationType    REG_SZ    Client
>     InstallDate    REG_DWORD    0x61e2300a
>     ProductName    REG_SZ    Windows 10 Pro
>     ReleaseId    REG_SZ    2009
>     SoftwareType    REG_SZ    System
>     UBR    REG_DWORD    0xcfc
>     PathName    REG_SZ    C:\Windows
>     ProductId    REG_SZ    00330-80000-00000-AA073
>     DisplayVersion    REG_SZ    22H2
>     RegisteredOwner    REG_SZ    jdeifik
>     RegisteredOrganization    REG_SZ
>     InstallTime    REG_QWORD    0x1d809b6d4ce7b09
>
> In practice, both the new and old processors typically run at about 3ghz when
> under load.
> When idling, both processors use about the same amount of power.
>
> I have 128gb of ram, in 4 slots. Using that configuration, I can get 100% load
> and significantly faster performance on linux.
> Therefore I conclude the issue is either with windows or cygwin, and is not a
> hardware issue.
>
> When I run cinebench, I can get to 100% cpu utilization (at around 3ghz) on
> windows.
Chances are the benchmark is designed to handle that:
"When the program is running inside the group, unless it is processor group
aware, then it can only access other threads in the same group. This means that
if a multi-threaded program can use 128 threads, if it isn't built with
processor groups in mind, then it might only spawn with access to 64."
I also do not know how you would program for that in Cygwin so that it maps
onto the equivalent Windows functions required.
Perhaps one of the developers involved could comment here?
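Short of an answer on the API side, a quick check from a shell can at least
show how many logical processors a process is given. This is only a sketch: it
assumes `nproc` (coreutils) is installed and that `/proc/cpuinfo` lists one
`processor` entry per logical CPU, as it does on Linux. How Cygwin fills in
these values on a multi-group system is an assumption to verify, not something
confirmed here.

```shell
#!/bin/sh
# Sketch: compare processor counts as seen from a POSIX shell.

# Processors this process may run on (nproc honours the affinity mask).
usable=$(nproc)

# Logical processors listed by the system; 0 if the file is missing.
listed=$(grep -c '^processor' /proc/cpuinfo 2>/dev/null)
listed=${listed:-0}

echo "usable by this process:  $usable"
echo "listed in /proc/cpuinfo: $listed"
# NUMBER_OF_PROCESSORS is set by Windows and is normally unset on Linux.
echo "NUMBER_OF_PROCESSORS:    ${NUMBER_OF_PROCESSORS:-unset}"

if [ "$usable" -lt "$listed" ]; then
    echo "this process is restricted to a subset of the logical processors"
fi
```

If the counts come out at 64 on a 128-thread system, as NUMBER_OF_PROCESSORS
did in the cygcheck output quoted earlier, that points at the processor group
boundary rather than at the hardware.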
> As for what the processors are 'designed' for, I really don't care.
> I want a reliable, fast computer with ECC memory, and I can get that with an
> EPYC processor.
> If a workload needs more than 128gb of memory, you pretty much need to use
> server processors.
> I can put in up to 2tb of memory in my system, if I have the need for that.
As I suggested above, and as the AT tests suggest, with your configuration, you
may get better results disabling simultaneous multithreading (SMT) on your
current system, or running Pro for Workstations, which you may be able to test
using a generic key.
Pro for Workstations is used and recommended by video shops, with much lower
costs and power consumption running AMD than Intel, as a designer's workstation
alternative getting better performance and responsiveness than using servers
for the same task.
--
Take care. Thanks, Brian Inglis              Calgary, Alberta, Canada

La perfection est atteinte                   Perfection is achieved
non pas lorsqu'il n'y a plus rien à ajouter  not when there is no more to add
mais lorsqu'il n'y a plus rien à retirer     but when there is no more to cut
                                -- Antoine de Saint-Exupéry