X-Spam-Check-By: sourceware.org Date: Fri, 1 Sep 2006 08:37:09 -0700 From: clayne AT anodized DOT com To: cygwin AT cygwin DOT com Subject: Re: cygwin fork() Message-ID: <20060901153709.GC7663@ns1.anodized.com> References: <20060901100138 DOT GA7444 AT ns1 DOT anodized DOT com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.11 X-Assp-Spam-Prob: 0.00000 X-Assp-Whitelisted: Yes X-Assp-Envelope-From: clayne AT ns1 DOT anodized DOT com X-Assp-Intended-For: cygwin AT cygwin DOT com X-IsSubscribed: yes Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm Precedence: bulk List-Unsubscribe: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: cygwin-owner AT cygwin DOT com Mail-Followup-To: cygwin AT cygwin DOT com Delivered-To: mailing list cygwin AT cygwin DOT com On Fri, Sep 01, 2006 at 10:33:47AM -0400, Igor Peshansky wrote: > While Cygwin is an *emulation* layer, and emulation is inherently slower > than straight execution, there are other potential reasons for the > slowness. Check your anti-virus and firewall software settings. If > possible, exclude the Cygwin filesystem from checking by those tools... > Even little things (like making the tool aware of the often-used Cygwin > programs and telling it to not check "outbound email messages" (!) sent by > those programs) can help speed up Cygwin... It's definitely none of those as I don't run any firewall or antivirus software whatsoever on this box. Windows 2003 Server, minimal set of services. The machine literally sits at 0% CPU unless I'm using it. > A one-sentence rant: even with those changes, whenever I run a Cygwin > shell script, the stupid vsmon process takes up 100% of the CPU. No known > solution. Sigh... Don't know what vsmon is.. A likely solution sounds to get rid of vsmon ;). Also, not to break out the ole Linux vs Cygwin time comparisons type deal, because we all know hot it's apples and oranges, in this case it's a bit absurd a difference and I kinda have to. On a Celeron 2gz Linux box w/512M RAM (which is WAY slower than a dual core Opteron 180 w/ 4GB of RAM): real 0m11.507s user 0m9.100s sys 0m2.320s 11s! For the same "make -j1" vs the 2m20s I got with "make -j1" on my 20060718 Cygwin box. 11s is about 1/4th the time of the best I could get my opteron to do with 'make -j8'. This is with all the exact same code. Now we know libtool is intensive, but the libtool being used is the exact same script (the build script is running a local generated version e.g. "/bin/sh ../libtool "). All I did was copy the exact same build directory to my linux host, cleaned it out, and ran the tests. Yes apples to oranges for sure, but let's look at some figures here: Linux 2.6.17.11, Celeron 2ghz/512MB/IDE drive, 11.507s (make -j1) Cygwin 20060718, Opteron 180 AT 2 DOT 8ghz/4GB/U320SCSI, 2m20s (make -j1) Cygwin 20060718, Opteron 180 AT 2 DOT 8ghz/4GB/U320SCSI, 45.654s (make -j8) This is what specifically makes me wonder what is up. Just one small segment of the build which I copied the commands from one to the other: Linux (Celeron): [clayne AT ns1 lib]$ time if /bin/sh ../libtool --tag=CC --mode=compile gcc -DHAVE_CONFIG_H -I. -I. -I.. -I../lib -march=pentium4 -mtune=pentium4 -mcpu=pentium4 -Os -MT network.lo -MD -MP -MF ".deps/network.Tpo" -c -o network.lo network.c; then mv -f ".deps/network.Tpo" ".deps/network.Plo"; else rm -f ".deps/network.Tpo"; exit 1; fi gcc -DHAVE_CONFIG_H -I. -I. -I.. -I../lib -march=pentium4 -mtune=pentium4 -mcpu=pentium4 -Os -MT network.lo -MD -MP -MF .deps/network.Tpo -c network.c -fPIC -DPIC -o .libs/network.o gcc -DHAVE_CONFIG_H -I. -I. -I.. -I../lib -march=pentium4 -mtune=pentium4 -mcpu=pentium4 -Os -MT network.lo -MD -MP -MF .deps/network.Tpo -c network.c -o network.o >/dev/null 2>&1 real 0m1.254s user 0m1.140s sys 0m0.110s [clayne AT ns1 lib]$ time gcc -DHAVE_CONFIG_H -I. -I. -I.. -I../lib -march=pentium4 -mtune=pentium4 -mcpu=pentium4 -Os -MT network.lo -MD -MP -MF .deps/network.Tpo -c network.c -fPIC -DPIC -o .libs/network.o real 0m0.549s user 0m0.520s sys 0m0.030s [clayne AT ns1 lib]$ time gcc -DHAVE_CONFIG_H -I. -I. -I.. -I../lib -march=pentium4 -mtune=pentium4 -mcpu=pentium4 -Os -MT network.lo -MD -MP -MF .deps/network.Tpo -c network.c -o network.o >/dev/null 2>&1 real 0m0.522s user 0m0.510s sys 0m0.010s Cygwin (Opteron): $ time if /bin/sh ../libtool --tag=CC --mode=compile gcc -DHAVE_CONFIG_H -I. -I. -I.. -I../lib -march=pentium4 -mtune=pentium4 -mcpu=pentium4 -Os -MT network.lo -MD -MP -MF ".deps/network.Tpo" -c -o network.lo network.c; then mv -f ".deps/network.Tpo" ".deps/network.Plo"; else rm -f ".deps/network.Tpo"; exit 1; fi gcc -DHAVE_CONFIG_H -I. -I. -I.. -I../lib -march=pentium4 -mtune=pentium4 -mcpu=pentium4 -Os -MT network.lo -MD -MP -MF .deps/network.Tpo -c network.c -DPIC -o .libs/network.o gcc -DHAVE_CONFIG_H -I. -I. -I.. -I../lib -march=pentium4 -mtune=pentium4 -mcpu=pentium4 -Os -MT network.lo -MD -MP -MF .deps/network.Tpo -c network.c -o network.o >/dev/null 2>&1 real 0m8.032s user 0m1.166s sys 0m2.216s $ time gcc -DHAVE_CONFIG_H -I. -I. -I.. -I../lib -march=pentium4 -mtune=pentium4 -mcpu=pentium4 -Os -MT network.lo -MD -MP -MF .deps/network.Tpo -c network.c -DPIC -o .libs/network.o real 0m0.506s user 0m0.249s sys 0m0.061s $ time gcc -DHAVE_CONFIG_H -I. -I. -I.. -I../lib -march=pentium4 -mtune=pentium4 -mcpu=pentium4 -Os -MT network.lo -MD -MP -MF .deps/network.Tpo -c network.c -o network.o >/dev/null 2>&1 real 0m0.506s user 0m0.233s sys 0m0.062s Obviously it's not gcc, or the execution of binary code that is slow, Same libtool, different systems, 8 times slower. Just as a crude test, I dumped the output of that same libtool command but with /bin/sh -x piped into awk '{print systime(),$0}' to try and see any possible obvious violators. Everything moving and then: 1157123322 + echo blah1 1157123322 blah1 <3 seconds of doing absolutely nothing here> 1157123325 + test -z '' 1157123325 + echo blah2 1157123325 blah2 The literal script (with my added echos) is: <-- snip --> exec_cmd= echo blah1 if test -z "$show_help"; then echo blah2 <-- snip --> Truely bizarre. I straced the entire thing and this is what I see: + echo blah1 235 8809138 [main] sh 3344 sig_send: sendsig 0x6E8, pid 3344, signal -34, its_me 1 28 8809166 [main] sh 3344 sig_send: wakeup 0x660 28 8809194 [main] sh 3344 sig_send: Waiting for pack.wakeup 0x660 4 8809198 [sig] sh 3344 wait_sig: signalling pack.wakeup 0x660 55 8809253 [main] sh 3344 sig_send: returning 0x0 from sending signal -34 47 8809300 [main] sh 3344 fhandler_base::write: binary write blah1 523 8809823 [main] sh 3344 sig_send: sendsig 0x6E8, pid 3344, signal -34, its_me 1 29 8809852 [main] sh 3344 sig_send: wakeup 0x660 28 8809880 [main] sh 3344 sig_send: Waiting for pack.wakeup 0x660 3 8809883 [sig] sh 3344 wait_sig: signalling pack.wakeup 0x660 54 8809937 [main] sh 3344 sig_send: returning 0x0 from sending signal -34 27 8809964 [main] sh 3344 sig_send: sendsig 0x6E8, pid 3344, signal -34, its_me 1 27 8809991 [main] sh 3344 sig_send: wakeup 0x660 27 8810018 [main] sh 3344 sig_send: Waiting for pack.wakeup 0x660 4 8810022 [sig] sh 3344 wait_sig: signalling pack.wakeup 0x660 53 8810075 [main] sh 3344 sig_send: returning 0x0 from sending signal -34 <--- point of repetition ---> 25 8810100 [main] sh 3344 readv: readv (255, 0x22BFC0, 1) blocking, sigcatchers 1 26 8810126 [main] sh 3344 readv: no need to call ready_for_read 28 8810154 [main] sh 3344 fhandler_base::read: read 0 bytes () 25 8810179 [main] sh 3344 fhandler_base::read: returning 1, text mode 26 8810205 [main] sh 3344 readv: 1 = readv (255, 0x22BFC0, 1), errno 0 112 8810317 [main] sh 3344 sig_send: sendsig 0x6E8, pid 3344, signal -34, its_me 1 27 8810344 [main] sh 3344 sig_send: wakeup 0x660 28 8810372 [main] sh 3344 sig_send: Waiting for pack.wakeup 0x660 4 8810376 [sig] sh 3344 wait_sig: signalling pack.wakeup 0x660 52 8810428 [main] sh 3344 sig_send: returning 0x0 from sending signal -34 27 8810455 [main] sh 3344 sig_send: sendsig 0x6E8, pid 3344, signal -34, its_me 1 27 8810482 [main] sh 3344 sig_send: wakeup 0x660 27 8810509 [main] sh 3344 sig_send: Waiting for pack.wakeup 0x660 4 8810513 [sig] sh 3344 wait_sig: signalling pack.wakeup 0x660 53 8810566 [main] sh 3344 sig_send: returning 0x0 from sending signal -34 <--- endless repetition continues until ---> + test -z '' 182 56491808 [main] sh 3344 sig_send: sendsig 0x6E8, pid 3344, signal -34, its_me 1 30 56491838 [main] sh 3344 sig_send: wakeup 0x660 28 56491866 [main] sh 3344 sig_send: Waiting for pack.wakeup 0x660 4 56491870 [sig] sh 3344 wait_sig: signalling pack.wakeup 0x660 52 56491922 [main] sh 3344 sig_send: returning 0x0 from sending signal -34 28 56491950 [main] sh 3344 sig_send: sendsig 0x6E8, pid 3344, signal -34, its_me 1 27 56491977 [main] sh 3344 sig_send: wakeup 0x660 27 56492004 [main] sh 3344 sig_send: Waiting for pack.wakeup 0x660 4 56492008 [sig] sh 3344 wait_sig: signalling pack.wakeup 0x660 53 56492061 [main] sh 3344 sig_send: returning 0x0 from sending signal -34 25 56492086 [main] sh 3344 set_signal_mask: oldmask 0x0, newmask 0x0, mask_bits 0x0 27 56492113 [main] sh 3344 set_signal_mask: not calling sig_dispatch_pending 1761 56493874 [main] sh 3344 sig_send: sendsig 0x6E8, pid 3344, signal -34, its_me 1 28 56493902 [main] sh 3344 sig_send: wakeup 0x660 28 56493930 [main] sh 3344 sig_send: Waiting for pack.wakeup 0x660 4 56493934 [sig] sh 3344 wait_sig: signalling pack.wakeup 0x660 53 56493987 [main] sh 3344 sig_send: returning 0x0 from sending signal -34 47 56494034 [main] sh 3344 fhandler_base::write: binary write + echo blah2 222 56494256 [main] sh 3344 sig_send: sendsig 0x6E8, pid 3344, signal -34, its_me 1 29 56494285 [main] sh 3344 sig_send: wakeup 0x660 28 56494313 [main] sh 3344 sig_send: Waiting for pack.wakeup 0x660 4 56494317 [sig] sh 3344 wait_sig: signalling pack.wakeup 0x660 56 56494373 [main] sh 3344 sig_send: returning 0x0 from sending signal -34 47 56494420 [main] sh 3344 fhandler_base::write: binary write blah2 I counted up the number of bytes that section of repetition took in the strace output, and divided it by a single rep's size: It does that same readv() blocking, sig_send, wait_sig loop about 115,012 times. -- Back in May, I did these timed runs for a similar build (albeit a larger directory) and arrrived at these figures for a make -j1: The dates are the cygwin snapshot version. Actual times seen: 20060309: real 1m47.328s user 0m17.074s sys 0m59.145s 20060318: real 3m20.953s user 0m11.703s sys 0m45.895s 20060314: real 3m21.594s user 0m11.713s sys 0m45.894s 20060313: real 3m21.594s user 0m11.715s sys 0m46.134s -- Unsubscribe info: http://cygwin.com/ml/#unsubscribe-simple Problem reports: http://cygwin.com/problems.html Documentation: http://cygwin.com/docs.html FAQ: http://cygwin.com/faq/