Mail Archives: cygwin/2015/11/19/15:25:16

Subject: Re: Cygwin multithreading performance
To: cygwin AT cygwin DOT com
References: <CABPLASTtRK4mNxh0M_AnZgjJQ15kWPx+L=U=VCU3Wwi7jV_57A AT mail DOT gmail DOT com>
From: Mark Geisert <mark AT maxrnd DOT com>
Message-ID: <564E3017.90205@maxrnd.com>
Date: Thu, 19 Nov 2015 12:24:55 -0800
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:42.0) Gecko/20100101 Firefox/42.0 SeaMonkey/2.39
MIME-Version: 1.0
In-Reply-To: <CABPLASTtRK4mNxh0M_AnZgjJQ15kWPx+L=U=VCU3Wwi7jV_57A@mail.gmail.com>

Kacper Michajlow wrote:
> I recently noticed that Cygwin multithreading is very inefficient. I
> was repacking a few git repositories, and Cygwin's git spawns threads,
> but they are so badly synchronized that there is no speed gain over
> one thread and possibly a loss because of the overhead. On my machine
> I got 7-10% CPU usage, while git built with mingw easily uses 100%.
>
> You can find the code in question here
> https://github.com/git/git/blob/master/builtin/pack-objects.c#L1967-L2094
>
> Do you have any suggestions? Is there any chance to get MT workloads
> improved in Cygwin? These days it is a really big problem, in my
> opinion.

Although there have been some issues with Cygwin pthreads reported and 
resolved, I can't recall complaints about their performance.  You don't 
supply much specific info so I had to guess that you must be doing 
something like 'git gc' to provoke calls to the code you quote.  Please 
give more info if I was mistaken.

I did an strace of 'git gc' over a small source tree I have and found:

> ~/src/cygwin-cygutils strace --mask=debug+syscall+thread -o git.strace git gc
> Counting objects: 1691, done.
> Delta compression using up to 4 threads.
> Compressing objects: 100% (398/398), done.
> Writing objects: 100% (1691/1691), done.
> Total 1691 (delta 1250), reused 1691 (delta 1250)
>
> ~/src/cygwin-cygutils grep "fork(" git.strace
>   350  111164 [main] git 360 fork: 0 = fork()
>    59  113379 [main] git 4980 fork: 360 = fork()
>   496  242346 [main] git 4980 fork: 368 = fork()
>   513  242585 [main] git 368 fork: 0 = fork()
>   828  589040 [main] git 4980 fork: 4968 = fork()
>   685  589341 [main] git 4968 fork: 0 = fork()
>   591  126631 [main] git 4968 fork: 1784 = fork()
>   483  126866 [main] git 1784 fork: 0 = fork()
>   618 2320996 [main] git 4980 fork: 2912 = fork()
>   558 2321259 [main] git 2912 fork: 0 = fork()
>   555 3023781 [main] git 4980 fork: 1612 = fork()
>   500 3024002 [main] git 1612 fork: 0 = fork()
>   766 3112383 [main] git 4980 fork: 1756 = fork()
>   681 3112655 [main] git 1756 fork: 0 = fork()

There's your problem.  Git is for some reason fork()ing to do its 
parallel operations.  fork() is very complicated to emulate on Windows, 
and Cygwin's fork() is already known to be slow compared to fork() on 
OSes that implement it natively.
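
For anyone who wants to tally the fork() calls in their own trace rather 
than eyeball the grep output, here is a minimal Python sketch.  The line 
layout it expects (delta, timestamp, [thread], program, pid, message) is 
only inferred from the listing above, and the names FORK_LINE, 
summarize_forks and SAMPLE are made up for illustration.

```python
import re

# Assumed layout of a Cygwin strace line, inferred from the listing above:
#   <delta> <timestamp> [<thread>] <program> <pid> fork: <retval> = fork()
FORK_LINE = re.compile(
    r"^\s*(?P<delta>\d+)\s+(?P<stamp>\d+)\s+\[(?P<thread>[^\]]+)\]\s+"
    r"(?P<prog>\S+)\s+(?P<pid>\d+)\s+fork:\s+(?P<ret>-?\d+)\s+=\s+fork\(\)"
)


def summarize_forks(lines):
    """Return (parent_pid, child_pid) for every fork() seen from the parent side."""
    pairs = []
    for line in lines:
        m = FORK_LINE.match(line)
        if m is None:
            continue  # not a fork() return line
        ret = int(m.group("ret"))
        if ret > 0:
            # In the parent, fork() returns the child's pid; the child sees 0.
            pairs.append((int(m.group("pid")), ret))
    return pairs


# First four lines of the grep output above, without the quoting prefix.
SAMPLE = """\
  350  111164 [main] git 360 fork: 0 = fork()
   59  113379 [main] git 4980 fork: 360 = fork()
  496  242346 [main] git 4980 fork: 368 = fork()
  513  242585 [main] git 368 fork: 0 = fork()
""".splitlines()

if __name__ == "__main__":
    for parent, child in summarize_forks(SAMPLE):
        print(f"pid {parent} forked child {child}")
```

Feeding it the lines of git.strace instead of SAMPLE should give one 
pair per subprocess that git started.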

Why is mingw faster?  Inspection of run-command.c in the git source tree 
(BTW thanks for the github link) shows that start_command() has two code 
paths divided by "#ifndef GIT_WINDOWS_NATIVE".  The Windows native path 
(e.g. mingw) doesn't fork() but instead spawns subprocesses.  On Cygwin 
the fork() path is used.  Git probably ought to use the spawn code path 
on Cygwin too.
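
To show the difference in shape between the two paths, here is a rough 
Python sketch.  It is not git's code: run_with_fork and run_with_spawn 
are hypothetical names, and os.posix_spawnp is only a stand-in for 
whatever spawn call the Windows native path really makes.  It makes no 
claim about which one is faster on any given system.

```python
import os
import sys


def _exit_code(status):
    """Turn a waitpid() status into an exit code, or -1 if killed by a signal."""
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1


def run_with_fork(argv):
    """fork() then exec: the parent's whole process state is duplicated first."""
    pid = os.fork()
    if pid == 0:
        # Child: replace the process image; bail out hard if exec fails.
        try:
            os.execvp(argv[0], argv)
        except OSError:
            os._exit(127)
    _, status = os.waitpid(pid, 0)
    return _exit_code(status)


def run_with_spawn(argv):
    """Spawn: ask the OS to start the new program without a user-visible fork()."""
    pid = os.posix_spawnp(argv[0], argv, os.environ)
    _, status = os.waitpid(pid, 0)
    return _exit_code(status)


# Harmless child command used for the demonstration: exits with status 3.
CMD = [sys.executable, "-c", "import sys; sys.exit(3)"]

if __name__ == "__main__":
    print("fork+exec exit code:", run_with_fork(CMD))
    print("spawn exit code:    ", run_with_spawn(CMD))
```

Both helpers give the caller the same result; the point is only that 
the spawn path never has to copy the calling process, which is the 
expensive part to emulate on Windows.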

I don't know offhand if this is something Cygwin's git maintainer would 
want to tackle or if it should be handled upstream, but I'd guess the latter.

Hope this helps,

..mark
