Mail Archives: cygwin/2015/11/21/05:53:24

Date: Sat, 21 Nov 2015 11:53:01 +0100
From: Corinna Vinschen <corinna-cygwin AT cygwin DOT com>
To: cygwin AT cygwin DOT com
Subject: Re: Cygwin multithreading performance


On Nov 21 01:21, Mark Geisert wrote:
> Kacper Michajlow wrote:
> >Thanks for the reply, and sorry for not being specific enough before. 'git
> >gc' is a driver which runs various git commands to do cleanup in a
> >repository, though I'm mostly concerned about the code I linked.
> >Instead of 'git gc' it is better to test 'git repack -a -f' directly,
> >preferably on a repository where it takes some time.
> >'git://sourceware.org/git/newlib-cygwin.git' is a good test case.
> >Although the performance hit is bigger with bigger repositories, this is
> >a good example to see what's going on.
>
> I appreciate the more specific info on how you experience the issue.
>
> >I'm well aware that forking on Windows is problematic, but I'm
> >explicitly interested in the parallelized part of execution. I don't care
> >about forks; while they slow things down too, they are not used in the
> >compression process, which is parallelized over all the CPU threads.
> >Each command is indeed forked, but I'm only interested in the
> >pack-objects part, hence the code I linked.
>
> OK, we're on the same page now :).
>
> >$ strace --mask=debug+syscall+thread -o git.strace git repack -a -f
> >Counting objects: 156690, done.
> >Delta compression using up to 12 threads.
> >Compressing objects: 100% (154730/154730), done.
> >Writing objects: 100% (156690/156690), done.
> >Total 156690 (delta 123449), reused 33146 (delta 0)
> >
> >$ grep "fork(" git.strace
> >   559   53728 [main] git 24340 fork: 24368 = fork()
> >   465   54022 [main] git 24368 fork: 0 = fork()
> >
> >Only two forks were created, while during compression only 25% CPU was
> >used (on a big repo like the Linux kernel it doesn't exceed 8%). With native
> >git the same workload easily uses 95-100% CPU and therefore is a lot
> >faster.
>
> I was able to reproduce your issue using a cloned newlib-cygwin repo. On a
> 6-CPU machine I saw max 36% CPU utilization during the compression phase.
> ProcessExplorer showed all 6 threads were getting CPU time (to varying
> degrees) and when suspended they were always trying to acquire a mutex.  I'd
> like to run some more straces and perhaps investigate with some other tools
> before saying more.  This may take a while.
>
> What I've done so far is install the git-debuginfo and cygwin-debuginfo
> packages so that I can convert hex RIP addresses to line numbers.  I've run
> the testcase under gdb so I can interrupt at random times and poke around.
> The straces from this testcase are ginormous so I hope I can figure out a
> better way to see why the compression threads aren't CPU-bound like they
> should be.  If you don't already know, 'strace --help' shows the available
> mask values.  The threads are each writing to disk, so I wonder if there's
> some unintentional serialization going on somewhere, but I don't know yet
> how I could verify that theory.

If I'm allowed to make an educated guess, the big serializers in Cygwin
are probably the calls to malloc, calloc, realloc, and free.  We desperately
need a new malloc implementation better suited to multi-threading.
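
One rough way to check a guess like this is to take git out of the picture
and time nothing but allocator calls from several threads.  The following is
a minimal sketch in C, not code from git or Cygwin: the names alloc_worker,
run_alloc_benchmark and struct worker_arg are made up for illustration, and
the block size used in the note below is an arbitrary assumption.  If the
allocator serializes on a single lock, wall-clock time for a fixed
per-thread iteration count should grow roughly with the thread count instead
of staying flat.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical per-thread parameters; not taken from git or Cygwin. */
struct worker_arg {
    long iterations;    /* malloc/free pairs to perform */
    size_t block_size;  /* bytes per allocation */
    long completed;     /* pairs actually completed, written once at exit */
};

/* Each worker does nothing but allocate, touch, and free one block. */
static void *alloc_worker(void *p)
{
    struct worker_arg *arg = p;
    long done = 0;      /* local counter avoids false sharing between threads */

    assert(arg != NULL);
    for (long i = 0; i < arg->iterations; i++) {
        /* Going through a volatile keeps the compiler from eliding the pair. */
        void *volatile slot = malloc(arg->block_size);
        char *buf = slot;
        if (buf == NULL)
            break;
        buf[0] = (char)i;
        free(buf);
        done++;
    }
    arg->completed = done;
    return NULL;
}

/* Runs nthreads workers concurrently.  Returns elapsed wall-clock seconds,
 * or -1.0 on bad arguments or if a thread could not be created. */
double run_alloc_benchmark(int nthreads, long iterations, size_t block_size,
                           long *total_completed)
{
    if (nthreads <= 0 || iterations < 0 || block_size == 0)
        return -1.0;

    pthread_t *tids = calloc((size_t)nthreads, sizeof *tids);
    struct worker_arg *args = calloc((size_t)nthreads, sizeof *args);
    if (tids == NULL || args == NULL) {
        free(tids);
        free(args);
        return -1.0;
    }

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);

    int started = 0;
    for (; started < nthreads; started++) {
        args[started].iterations = iterations;
        args[started].block_size = block_size;
        if (pthread_create(&tids[started], NULL, alloc_worker,
                           &args[started]) != 0)
            break;
    }

    long total = 0;
    for (int i = 0; i < started; i++) {
        pthread_join(tids[i], NULL);
        total += args[i].completed;
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    free(tids);
    free(args);
    if (total_completed != NULL)
        *total_completed = total;
    if (started != nthreads)
        return -1.0;
    return (double)(t1.tv_sec - t0.tv_sec)
         + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
}
```

Comparing, say, run_alloc_benchmark(1, 1000000, 64, NULL) against
run_alloc_benchmark(12, 1000000, 64, NULL) on the same machine gives a crude
scaling figure.  It only shows whether the allocator scales; it says nothing
about where a lock is held.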


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat

