X-Spam-Check-By: sourceware.org
Message-Id: <200603290029.k2T0Tvt3019354@tigris.pounder.sol.net>
To: cygwin AT cygwin DOT com
From: cygwin AT trodman DOT com (Tom Rodman)
Reply-to: cygwin AT cygwin DOT com
Subject: informal report on "fork: Resource temporarily unavailable" incidents
Date: Tue, 28 Mar 2006 18:29:57 -0600
X-IsSubscribed: yes
Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm
Sender: cygwin-owner AT cygwin DOT com
Mail-Followup-To: cygwin AT cygwin DOT com
Delivered-To: mailing list cygwin AT cygwin DOT com

Not expecting help, just sharing.

We're running Windows Server 2003 Enterprise Edition on a quad-processor
HP ProLiant G3. "uname -a" reports:

  CYGWIN_NT-5.2 c7mdcs063 1.5.19(0.150/4/2) 2006-01-20 13:28 i686 Cygwin

Yesterday I ran a script (kicked off from a makefile) to create /etc/passwd
for a subset of the users in our domain. It failed with fork errors (see
below). After the errors showed up, I simplified things by building and
running a script like this:

  #!/bin/bash
  mkpasswd -d -u username1
  mkpasswd -d -u username2
  mkpasswd -d -u username3
  .
  .

This script (search ahead for "r7867") had about 400 unique usernames. When
I ran it on Monday it also failed with fork errors, on random lines.

While the problem was happening there were ~105 processes running.
"net start|wc -l" shows 66 services.

What apparently fixed the problem:

This Tuesday morning I came in, resumed and then closed the suspended vim
sessions, closed the ssh sessions, opened a new ssh session, and then ran
the more complex script 100 times in a row. It never failed.

Tonight I opened the same files in vim sessions, suspended them, and then
ran the job once. It worked fine.

We're rebooting the server tonight; it's been up just over 1 week.

Should I try a later snapshot?
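For anyone who wants to reproduce this, here is a minimal sketch of the simplified per-user script written as a loop that retries a failed call. Only "mkpasswd -d -u" comes from the script above; the names run_with_retry, make_passwd_entries, MAX_TRIES, RETRY_DELAY, and the users.txt file in the usage line are hypothetical. It is a possible workaround, not a fix: it only retries when the command returns a non-zero status, and if the shell itself cannot fork it may abort before the retry ever runs.

```shell
#!/bin/bash
# Hypothetical sketch: retry a command a few times before giving up.
# MAX_TRIES and RETRY_DELAY are illustrative defaults, not tuned values.
MAX_TRIES=${MAX_TRIES:-3}
RETRY_DELAY=${RETRY_DELAY:-1}

# Run "$@"; on failure, sleep and try again, up to MAX_TRIES attempts.
# Returns 0 on the first success, 1 if every attempt failed.
run_with_retry() {
    local try=1
    while true; do
        "$@" </dev/null && return 0
        if [ "$try" -ge "$MAX_TRIES" ]; then
            return 1
        fi
        try=$((try + 1))
        sleep "$RETRY_DELAY"
    done
}

# Read one username per line from stdin and print a passwd entry for
# each on stdout. Blank lines are skipped; failures go to stderr in the
# same form as the generated script (" failed for [user]").
make_passwd_entries() {
    local user
    while IFS= read -r user; do
        [ -n "$user" ] || continue
        run_with_retry mkpasswd -d -u "$user" ||
            echo " failed for [$user]" >&2
    done
}
```

With the functions loaded, something like "sort -u users.txt | make_passwd_entries >passwd.autogen.domain" would stand in for the 400-line generated script.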
--
thanks much,
Tom Rodman

1st fork error on Monday:

--v-v------------------C-U-T---H-E-R-E-------------------------v-v--
> 16:09:13 Mon Mar 27 2j tty0 6480 /adm/config/etc
> OurSrvr063 staffuser1
> make domain local clean4main passwd group clean_local
if ! user_list_external=/adm/db/bcm_users/all_users /adm/bin/app/s/mkpasswd_4domain_subset >passwd.a
then \
  echo looks like were not in a domain, not making passwd.autogen.domain >&2 ;\
  > /adm/config/etc/passwd.autogen.domain ;\
fi
/adm/config/bp/bash.bp.shinc: fork: Resource temporarily unavailable
--v-v------------------C-U-T---H-E-R-E-------------------------v-v--

Here's the slightly adjusted heap setting on the server:

bash-3.00$ ccs=/proc/registry/HKEY_LOCAL_MACHINE/SYSTEM/CurrentControlSet
bash-3.00$ key="$ccs/Control/Session Manager/SubSystems/Windows"
bash-3.00$ perl -lne 'print $1 if m{ (Shar.*?) }' "$key"
SharedSection=1536,3584,768

Snippets of subsequent fork or heap errors from Monday tests:

--v-v------------------C-U-T---H-E-R-E-------------------------v-v--
> 16:20:32 Mon Mar 27 2j tty0 6480 /etc
> OurSrvr063 staffuser1
> cd $cf/etc;make domain local clean4main passwd group clean_local
if ! user_list_external=/adm/db/bcm_users/all_users /adm/bin/app/s/mkpasswd_4domain_subset >passwd.autogen.domain ;\
then \
  echo looks like were not in a domain, not making passwd.autogen.domain >&2 ;\
  > /adm/config/etc/passwd.autogen.domain ;\
fi
/adm/bin/sys/s/bash_rXF_: fork: Resource temporarily unavailable
      5 [main] ? (6484) child_copy: cygheap read copy failed, 0x6115B900..0x611661F4, done 0, windows pid 2291660, Win32 error 5
--snip
PW_AG_DOMAIN=/adm/config/etc/passwd.autogen.domain \
PW_PATCH_DOMAIN=/adm/config/etc/passwd.patch.domain \
PW_PATCH_DOMAIN_LOCALHOST=/etc/passwd.patch.domain.localhost \
  /adm/bin/app/s/passwd_mk_passwd.cust.domain >/etc/passwd.cust.domain
make: vfork: Resource temporarily unavailable
--snip
> 16:21:04 Mon Mar 27 2j tty0 6480 /adm/config/etc
> OurSrvr063 staffuser1
> cd $cf/etc;make domain local clean4main passwd group clean_local
if ! user_list_external=/adm/db/bcm_users/all_users /adm/bin/app/s/mkpasswd_4domain_subset >passwd.autogen.domain ;\
then \
  echo looks like were not in a domain, not making passwd.autogen.domain >&2 ;\
  > /adm/config/etc/passwd.autogen.domain ;\
fi
/adm/bin/sys/s/bash_rXF_: fork: Resource temporarily unavailable
      5 [main] ? (6484) child_copy: cygheap read copy failed, 0x6115B900..0x611661AC, done 0, windows pid 2291660, Win32 error 5
xargs: error waiting for child process: No child processes
make: *** Deleting file `passwd.autogen.domain'
--snip
> 17:18:12 Mon Mar 27 0j tty1 1372 ~
> OurSrvr063 staffuser1
> head /tmp/r7867
mkpasswd -d -u "adm_ds" || echo " failed for [adm_ds]" >&2
mkpasswd -d -u "adm_ndb" || echo " failed for [adm_ndb]" >&2
mkpasswd -d -u "adm_tsr" || echo " failed for [adm_tsr]" >&2
mkpasswd -d -u "staffuser2" || echo " failed for [staffuser2]" >&2
mkpasswd -d -u "calbrec" || echo " failed for [calbrec]" >&2
mkpasswd -d -u "camundc" || echo " failed for [camundc]" >&2
mkpasswd -d -u "carmstl" || echo " failed for [carmstl]" >&2
mkpasswd -d -u "carshej" || echo " failed for [carshej]" >&2
mkpasswd -d -u "cartiof" || echo " failed for [cartiof]" >&2
mkpasswd -d -u "casmusm" || echo " failed for [casmusm]" >&2

> 17:18:14 Mon Mar 27 0j tty1 1372 ~
> OurSrvr063 staffuser1
> bash -c /tmp/r7867 >/dev/null
/tmp/r7867: fork: Resource temporarily unavailable

> 17:18:23 Mon Mar 27 0j tty1 1372 ~
> OurSrvr063 staffuser1
>       6 [main] ? (6484) child_copy: cygheap read copy failed, 0x6115B900..0x61162524, done 0, windows pid 2291660, Win32 error 5

> 17:18:58 Mon Mar 27 0j tty1 1372 ~
> OurSrvr063 staffuser1
> bash -c /tmp/r7867 >/dev/null
/tmp/r7867: fork: Resource temporarily unavailable
     30 [main] ? (6484) child_copy: cygheap read copy failed, 0x6115B900..0x61162524, done 0, windows pid 2291660, Win32 error 5

--
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
Problem reports:       http://cygwin.com/problems.html
Documentation:         http://cygwin.com/docs.html
FAQ:                   http://cygwin.com/faq/