Subject: Re: Cygwin multithreading performance
References: <564E3017 DOT 90205 AT maxrnd DOT com>
To: cygwin AT cygwin DOT com
From: Mark Geisert
Message-ID: <5650379B.4030405@maxrnd.com>
Date: Sat, 21 Nov 2015 01:21:31 -0800

Kacper Michajlow wrote:
> Thanks for reply. And sorry for being not specific enough before.
> 'git gc' is a driver which runs various git command to do cleanup in
> repository. Though I'm mostly concerned about the code I linked.
> Instead of 'git gc' it is better to test directly 'git repack -a -f'
> and possibly on repository where it takes some time.
> 'git://sourceware.org/git/newlib-cygwin.git' is good test case.
> Although with bigger repositories performance hit is bigger, this is
> good example to see what's going on.

I appreciate the more specific info on how you experience the issue.

> I'm well aware that forking on windows is problematic, but I
> explicitly interested in parallelized part of execution. I don't care
> about forks, while this slows things down too, they are not used in
> compression process which is parallelized over the all cpu threads.
> Each command is indeed forked, but I'm only interested about
> pack-objects part hence the code I linked.

OK, we're on the same page now :).

> $ strace --mask=debug+syscall+thread -o git.strace git repack -a -f
> Counting objects: 156690, done.
> Delta compression using up to 12 threads.
> Compressing objects: 100% (154730/154730), done.
> Writing objects: 100% (156690/156690), done.
> Total 156690 (delta 123449), reused 33146 (delta 0)
>
> $ grep "fork(" git.strace
>   559   53728 [main] git 24340 fork: 24368 = fork()
>   465   54022 [main] git 24368 fork: 0 = fork()
>
> Only two forks were created, while during compression only 25% cpu was
> used (on big repo like linux kernel it doesn't exceed 8%). With native
> git the same workload easily uses 95-100% cpu and therefor is a lot
> faster.

I was able to reproduce your issue using a cloned newlib-cygwin repo. On a 6-CPU machine I saw at most 36% CPU utilization during the compression phase. Process Explorer showed that all 6 threads were getting CPU time (to varying degrees), and whenever they were suspended they were trying to acquire a mutex.

I'd like to run some more straces and perhaps investigate with some other tools before saying more. This may take a while.
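
For a first overview of a trace like git.strace without grepping by hand, something along these lines might help. This is only a minimal sketch: it assumes every strace line has the same layout as the two fork lines quoted above (delta, elapsed time, [thread], program, pid, function, message), and the `summarize` helper and `LINE_RE` pattern are names made up for illustration, not part of any Cygwin tool.

```python
import re
from collections import Counter

# Assumed layout, inferred only from the two lines quoted above:
#   <delta> <elapsed> [<thread>] <program> <pid> <function>: <message>
LINE_RE = re.compile(
    r"^\s*(\d+)\s+(\d+)\s+\[([^\]]+)\]\s+(\S+)\s+(\d+)\s+(\S+?):\s+(.*)$"
)

def summarize(lines):
    """Count trace lines per thread name and per reporting function."""
    per_thread = Counter()
    per_function = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if m is None:
            continue  # skip anything that does not fit the assumed layout
        _delta, _elapsed, thread, _prog, _pid, func, _msg = m.groups()
        per_thread[thread] += 1
        per_function[func] += 1
    return per_thread, per_function

# The two fork lines from the quoted grep output.
sample = [
    "  559   53728 [main] git 24340 fork: 24368 = fork()",
    "  465   54022 [main] git 24368 fork: 0 = fork()",
]
threads, functions = summarize(sample)
print(threads.most_common(5))
print(functions.most_common(5))

# For a real trace, pass the open file instead of the sample list:
#   with open("git.strace") as f:
#       threads, functions = summarize(f)
```

If one or two function names dominate the per-function counts for the compression threads, that would at least hint at where to look next, though it says nothing about how long each call blocked.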

What I've done so far is install the git-debuginfo and cygwin-debuginfo packages so that I can convert hex RIP addresses to line numbers. I've run the testcase under gdb so I can interrupt at random times and poke around. The straces from this testcase are ginormous, so I hope I can figure out a better way to see why the compression threads aren't CPU-bound like they should be. If you don't already know, 'strace --help' shows the available mask values.

The threads are each writing to disk, so I wonder if there's some unintentional serialization going on somewhere, but I don't know yet how I could verify that theory.

..mark

--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple