X-Spam-Check-By: sourceware.org
Message-ID: <46C3364A.5030502@alink.co.za>
Date: Wed, 15 Aug 2007 18:22:18 +0100
From: George
User-Agent: Thunderbird 2.0.0.6 (Macintosh/20070728)
MIME-Version: 1.0
To: cygwin AT cygwin DOT com
Subject: Re: Trouble with perl fork() and exec()
References: <46C32C2F DOT 6030505 AT alink DOT co DOT za>
In-Reply-To: <46C32C2F.6030505@alink.co.za>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-IsSubscribed: yes
Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm
Sender: cygwin-owner AT cygwin DOT com
Mail-Followup-To: cygwin AT cygwin DOT com
Delivered-To: mailing list cygwin AT cygwin DOT com

I've just found something in an strace:

**********************************************
Program name: C:\cygwin\bin\perl.exe (windows pid 5936)
App version:  1005.24, api: 0.156
DLL version:  1005.24, api: 0.156
DLL build:    2007-01-31 10:57
OS version:   Windows NT-5.2
Date/Time:    2007-08-15 18:04:46
**********************************************
     114      376 [main] perl (5936) child_copy: cygheap - hp 0x67C low 0x611668E0, high 0x6116BBF8, res 1
      47      423 [main] perl (5936) child_copy: done
      70      493 [main] perl (5936) open_shared: name (null), n 4, shared 0x60000000 (wanted 0x60000000), h 0xEC
      99      592 [main] perl (5936) heap_init: heap base 0x10410000, heap top 0x10760000
      62      654 [main] perl (5936) open_shared: name (null), n 1, shared 0x60010000 (wanted 0x60010000), h 0xF0
      43      697 [main] perl (5936) user_shared_initialize: opening user shared for '' at 0x60010000
      44      741 [main] perl (5936) user_shared_initialize: user shared version B1D50001
      58      799 [main] perl (5936) open_shared: name (null), n 2, shared 0x60040000 (wanted 0x60040000), h 0xF4
     186      985 [main] perl (5936) open_shared: name Global\cygwin1S4.cygpid.5936, n 5936, shared 0x60030000 (wanted 0x60030000), h 0x768
      54     1039 [main] perl 5936 set_myself: myself->dwProcessId 5936
      84     1123 [main] perl 5936 child_copy: dll data - hp 0x67C low 0x61100000, high 0x61104BA0, res 1
12277544 12278667 [main] perl 5936 child_copy: dll bss - hp 0x67C low 0x6113F000, high 0x611483D0, res 1
    6188 12284855 [main] perl 5936 child_copy: user heap - hp 0x67C low 0x10410000, high 0x10760000, res 1
      92 12284947 [main] perl 5936 child_copy: done
     108 12285055 [main] perl 5936 child_copy: data - hp 0x67C low 0x408000, high 0x408010, res 1
      98 12285153 [main] perl 5936 child_copy: bss - hp 0x67C low 0x40A000, high 0x40A0F0, res 1
      56 12285209 [main] perl 5936 child_copy: done

It would seem the dll bss copy is taking 12 seconds. The machine running
this is a quad Xeon with 16 GB of RAM, so it shouldn't be short of
resources (CPU usage is very low).

Can I give any data to help debug this?

George wrote:
> Hello,
>
> I have a cygwin installation under which I'm running the
> Net::Server::Fork daemon "munin-node". For those not aware, munin is a
> monitoring system which is really easy to use and configure
> (http://munin.projects.linpro.no).
>
> That said, it's not working properly.
>
> Here's the trouble I'm having - maybe somebody has seen it before and
> can push me in the right direction?
>
> The basic flow of the daemon is:
>
>
>
>
>
>
>
>
>
>
> Now, this is breaking unfortunately, so I never get to sing songs and
> drink beer.
>
> What seems to happen is the child gets fork()ed and then the plugin code
> exec()'d. The data then comes back up the line via the child's STDOUT to
> the parent. However, despite the child finishing execution (I've made
> sure all sockets are closed and even tried a die()), it never exits.
>
> I've made sure the data is actually coming back by putting a print
> in the while loop, and that shows that it's coming back from the child.
> All the data makes it back, but the while loop doesn't finish and the
> timeout alarm hits, so the child gets reaped. When it's reaped it
> returns "Interrupted system call".
>
> I've tried replacing the exec() with a dirty hack of system();exit(); but
> exactly the same thing happens.
>
> The relevant code which does the running of the plugin is below:
>
> (Full code:
> http://munin.projects.linpro.no/browser/branches/1.2-stable/node/munin-node.in)
>
>
>
> ..
>   print "# Forking .. \n" if $DEBUG;
>   if ($child = open (CHILD, "-|")) {
>     eval {
>       local $SIG{ALRM} = sub { $timed_out=1; die "$!\n"};
>       alarm($timeout);
>       while(<CHILD>) {
>         #last if $_ eq "# DONE";
>         if ($_ eq "# DONE") { close(CHILD); }
>         push @lines,$_;
>         print "#DEBUG CHILD: $_" if $DEBUG;
>       }
>       print "# Finished gathering data from Child\n" if $DEBUG;
>     };
>     if( $timed_out ) {
>       print "# Child timed out - calling reap_children $@ \n" if $DEBUG;
>       reap_children($child, "$service $command: $@");
>       close (CHILD);
>       return ();
>     }
>     unless (close CHILD)
>     {
>       if ($!)
>       {
>         # If Net::Server::Fork is currently taking care of reaping,
>         # we get false errors. Filter them out.
>         unless (defined $autoreap and $autoreap)
>         {
>           logger ("Error while executing plugin \"$service\": $!");
>         }
>       }
>       else
>       {
>         logger ("Plugin \"$service\" exited with status $?. --@lines--");
>       }
>     }
>   }
>   else {
>     if ($child == 0) {
>       my $timenow = localtime();
>       print "# Child forked as $$ - $timenow\n" if $DEBUG;
>       # New process group...
>       POSIX::setsid();
>
> ..
>
> ..
>
>       print "# Execing $servicedir/$service $command\n" if $DEBUG;
>       exec ("$servicedir/$service", $command);
>
> ..
>
> --
> Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
> Problem reports:       http://cygwin.com/problems.html
> Documentation:         http://cygwin.com/docs.html
> FAQ:                   http://cygwin.com/faq/
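
A note on reading the strace figures at the top of this message: the
12-second stall can be picked out mechanically rather than by eye. The
sketch below assumes the first column of each strace line is the time in
microseconds since the previous line and the second column is the running
total; that reading is inferred from the trace itself (1123 + 12277544 =
12278667), not taken from the strace documentation. The helper names
parse_strace_line and slowest_step are made up for this illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumed line shape (inferred from the trace, not from documentation):
#   <delta_us> <elapsed_us> [thread] program pid function: message
# The pid is wrapped in parentheses on the earliest startup lines.
sub parse_strace_line {
    my ($line) = @_;
    $line =~ s/\r?\n\z//;    # drop the trailing CR/LF (the ^M in the paste)
    return unless $line =~
        /^\s*(\d+)\s+(\d+)\s+\[(\w+)\]\s+(\S+)\s+\(?(\d+)\)?\s+(.*)$/;
    return {
        delta_us   => $1,
        elapsed_us => $2,
        thread     => $3,
        prog       => $4,
        pid        => $5,
        msg        => $6,
    };
}

# Return the parsed entry with the largest delta, or undef if no line parsed.
sub slowest_step {
    my (@lines) = @_;
    my $worst;
    for my $line (@lines) {
        my $entry = parse_strace_line($line) or next;
        $worst = $entry
            if !$worst || $entry->{delta_us} > $worst->{delta_us};
    }
    return $worst;
}

# Three lines copied from the trace, with the CR/LF endings left on.
my @sample = (
    "      84     1123 [main] perl 5936 child_copy: dll data - hp 0x67C low 0x61100000, high 0x61104BA0, res 1\r\n",
    "12277544 12278667 [main] perl 5936 child_copy: dll bss - hp 0x67C low 0x6113F000, high 0x611483D0, res 1\r\n",
    "    6188 12284855 [main] perl 5936 child_copy: user heap - hp 0x67C low 0x10410000, high 0x10760000, res 1\r\n",
);

my $worst = slowest_step(@sample);
printf "%.1f s elapsed before: %s\n", $worst->{delta_us} / 1e6, $worst->{msg}
    if $worst;
```

To run it over a whole trace file, the same two subs can be fed from
standard input, e.g. slowest_step(<STDIN>). Since each line is logged after
the step it describes has finished, the largest delta points at the step
that was slow, not at the one that follows it.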