Mail Archives: cygwin/2005/07/21/19:41:25
Greetings:
I am not expecting help; I just want to share a problem that
I have seen repeatedly, every week or so, on one host (Windows 2000 Server
w/latest service packs and fixes), using Cygwin 1.5.17.
I have not found a pattern or a cause, but I seem to recall that
the shell that "goes south" often (always?) has several suspended
jobs - usually a mix of "vim" and "less". (Notice the child processes
of pid 6084 shown below.)
I will be running fairly routine sysadmin commands in a bash session on
a remote host through an ssh session. A command succeeds and the bash
prompt returns; then I type a simple command ("ls" in the example
below), and after the cursor moves down to the first column of the
next line, *nothing* further happens. If I look with Cygwin's 'ps' for
the bash process that was running the interactive shell, it is not
there (pid is 6084 in the example below).
Strangely, though, /proc has both the bash session and its
parent. "cat /proc/6084/ppid" shows the value 1052, which is the parent
sshd process ("/usr/sbin/sshd -D -R"; see the example below).
It turns out Cygwin's ps shows neither process 1052 nor 6084. Both
the sshd and its child bash session mysteriously vanished, but the
child processes of the bash session remain.
(I'm not sure how much I should trust the output of
"procps -H -Ao pid,ppid,%cpu,user,bsdstart,args",
but that's a side issue. Search ahead for "defunct".)
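For anyone wanting to repeat the /proc-versus-ps comparison described above, here is a minimal sketch (not taken from the original session) that lists PIDs present in one list but missing from another. The function name pids_missing_from_ps is hypothetical, and the two PID lists are passed in as arguments so the logic does not depend on any particular ps output format.

```shell
# pids_missing_from_ps PROC_PIDS PS_PIDS
#   PROC_PIDS: whitespace-separated PIDs that have a /proc/<pid> entry
#   PS_PIDS:   whitespace-separated PIDs reported by ps
# Prints, one per line, each PID in the first list that is absent from the second.
pids_missing_from_ps() {
    proc_pids=$1
    # Collapse any newlines/tabs to single spaces and pad with spaces,
    # so that every PID can be matched as a whole word.
    ps_pids=" $(echo $2) "
    for pid in $proc_pids; do
        case "$ps_pids" in
            *" $pid "*) ;;        # ps knows about this PID
            *) echo "$pid" ;;     # only /proc knows about it
        esac
    done
}

# Example with illustrative lists:
#   pids_missing_from_ps "1052 6084 2768" "2768 6188"
```

On a live system the first list might be gathered with something like `ls /proc | grep -E '^[0-9]+$'` and the second from the PID column of ps; the exact column layout of Cygwin's ps output is an assumption to check locally before relying on it.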
--
regards,
Tom
--v-v------------------C-U-T---H-E-R-E-------------------------v-v--
# --------------------------------------------------------------------
# this bash session shows my checks *after* bash session w/pid
# 6084 died. 6084's ppid was 1052
# --------------------------------------------------------------------
[16:36:03 Thu Jul 21 ~ ourhost scmcron
-bash-2.05b] $ tty
/dev/tty1
[16:36:05 Thu Jul 21 ~ ourhost scmcron
-bash-2.05b] $ which ps_
ps_ is aliased to `procps -H -Ao pid,ppid,%cpu,user,bsdstart,args'
[16:36:14 Thu Jul 21 ~ ourhost scmcron
-bash-2.05b] $ ps_
PID PPID %CPU USER START COMMAND
3040 6084 0.0 scmcron 16:03 vim rc_startup
5240 6084 0.0 scmcron 15:33 less -I -j4 -x2 -S /var/log/rc_startup.log
4148 6084 0.0 scmcron 15:30 vim basename.shinc
3196 6084 0.0 scmcron 14:51 vim _logrotate
1624 1 0.0 SYSTEM 00:38 /usr/bin/cygrunsrv
1696 1624 0.0 SYSTEM 00:38 /usr/sbin/sshd -D
1052 1696 0.0 SYSTEM 09:43 /usr/sbin/sshd -D -R
6188 1696 0.0 SYSTEM 16:14 /usr/sbin/sshd -D -R
2768 6188 0.0 scmcron 16:14 -bash
3396 2768 0.0 scmcron 16:32 procps -H -Ao pid,ppid,%cpu,user,bsdstart,args
1028 1 0.0 SYSTEM 00:38 /usr/bin/cygrunsrv
1116 1028 0.0 SYSTEM 00:38 /usr/sbin/cron -D
[16:36:17 Thu Jul 21 ~ ourhost scmcron
-bash-2.05b] $ cat /proc/6084/ppid
1052
[16:36:43 Thu Jul 21 ~ ourhost scmcron
-bash-2.05b] $ which ps
ps is aliased to `ps -elW '
ps is /usr/bin/ps
[16:37:07 Thu Jul 21 ~ ourhost scmcron
-bash-2.05b] $ ps|egrep '\<1052|6084\>'
[16:37:34 Thu Jul 21 ~ ourhost scmcron
-bash-2.05b] $
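A side note on the egrep check above: in the pattern '\<1052|6084\>' the alternation binds more loosely than the word anchors, so it matches "1052" at the start of a word or "6084" at the end of one (it would also match 10521 or 16084). That does not change the result here, since nothing matched at all. A minimal sketch of a stricter whole-word match follows, assuming a grep that supports -w (GNU grep does); the function name match_pids is hypothetical.

```shell
# match_pids PID...
# Reads ps-style lines on stdin and prints only those lines that contain
# one of the given PIDs as a whole word.
match_pids() {
    # Join the arguments with '|' to build an alternation, e.g. "1052|6084".
    pattern=$(printf '%s' "$*" | tr ' ' '|')
    # -E: extended regex; -w: the match must be a whole word.
    grep -Ew "$pattern"
}

# Example: ps | match_pids 1052 6084
```

Note that this matches the numbers in any column, so a process whose PPID is 6084 would also be printed; that is usually what is wanted when hunting for orphaned children.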
--v-v------------------C-U-T---H-E-R-E-------------------------v-v--
# --------------------------------------------------------------------
# Here are the last two commands typed in bash session w/pid 6084
# The "ls" command never completed.
# Sorry for the prompt, PS1 below is "\t \d \jj \l 4880 \w\n> \h \u >"
# This session has tty of tty0, and 4 suspended jobs.
# --------------------------------------------------------------------
> 16:15:07 Thu Jul 21 4j tty0 6084 /drv/c/adm/config/rc
> ourhost scmcron > regtool -s set /HKEY_LOCAL_MACHINE/SYSTEM/CurrentControlSet/Services/rc_startup/Parameters/Environment/PATH_ADM 'c:'
> 16:16:01 Thu Jul 21 4j tty0 6084 /drv/c/adm/config/rc
> ourhost scmcron > ls
--v-v------------------C-U-T---H-E-R-E-------------------------v-v--
# --------------------------------------------------------------------
# change in "procps -H -Ao pid,ppid,%cpu,user,bsdstart,args" output
# cause again unknown
# (About 1 hour had passed, during which I looked at the processes via a TS session and Task Manager.)
# I waited ~10 min and re-ran procps, and the output returned to normal; i.e.,
# all the defunct pids were replaced again w/the earlier integer values!
# --------------------------------------------------------------------
[17:30:49 Thu Jul 21 /proc/registry/HKEY_LOCAL_MACHINE/SYSTEM/CurrentControlSet/Services/rc_startup/Parameters/Environment c7mkes108 scmcron
-bash-2.05b] $ ps_
PID PPID %CPU USER START COMMAND
3040 6084 0.0 scmcron 16:03 <defunct>
5240 6084 0.0 scmcron 15:33 <defunct>
4148 6084 0.0 scmcron 15:30 <defunct>
3196 6084 0.0 scmcron 14:51 <defunct>
1624 1 0.0 SYSTEM 00:38 <defunct>
1696 1624 0.0 SYSTEM 00:38 <defunct>
1052 1696 0.0 SYSTEM 09:43 <defunct>
6188 1696 0.0 SYSTEM 16:14 <defunct>
2768 6188 0.0 scmcron 16:14 <defunct>
2936 2768 0.7 scmcron 17:27 procps -H -Ao pid,ppid,%cpu,user,bsdstart,args
1028 1 0.0 SYSTEM 00:38 <defunct>
1116 1028 0.0 SYSTEM 00:38 <defunct>
[17:31:01 Thu Jul 21 /proc/registry/HKEY_LOCAL_MACHINE/SYSTEM/CurrentControlSet/Services/rc_startup/Parameters/Environment c7mkes108 scmcron
-bash-2.05b] $ cat /proc/6084/ppid
1052
<snip>
[17:40:56 Thu Jul 21 /proc/registry/HKEY_LOCAL_MACHINE/SYSTEM/CurrentControlSet/Services/rc_startup/Parameters/Environment c7mkes108 scmcron
-bash-2.05b] $ ps_
PID PPID %CPU USER START COMMAND
3040 6084 0.0 scmcron 16:03 vim rc_startup
5240 6084 0.0 scmcron 15:33 less -I -j4 -x2 -S /var/log/rc_startup.log
4148 6084 0.0 scmcron 15:30 vim basename.shinc
3196 6084 0.0 scmcron 14:51 vim _logrotate
1624 1 0.0 SYSTEM 00:38 /usr/bin/cygrunsrv
1696 1624 0.0 SYSTEM 00:38 /usr/sbin/sshd -D
6188 1696 0.0 SYSTEM 16:14 /usr/sbin/sshd -D -R
2768 6188 0.0 scmcron 16:14 -bash
5388 2768 0.0 scmcron 17:44 procps -H -Ao pid,ppid,%cpu,user,bsdstart,args
1028 1 0.0 SYSTEM 00:38 /usr/bin/cygrunsrv
1116 1028 0.0 SYSTEM 00:38 /usr/sbin/cron -D
--v-v------------------C-U-T---H-E-R-E-------------------------v-v--
# --------------------------------------------------------------------
# I finally killed the ssh client, which had originated from a Linux box.
# --------------------------------------------------------------------
> 16:15:07 Thu Jul 21 4j tty0 6084 /drv/c/adm/config/rc
> c7mkes108 scmcron > regtool -s set /HKEY_LOCAL_MACHINE/SYSTEM/CurrentControlSet/Services/rc_startup/Parameters/Environment/PATH_ADM 'c:'
> 16:16:01 Thu Jul 21 4j tty0 6084 /drv/c/adm/config/rc
> c7mkes108 scmcron > ls
Killed by signal 15.
[17:41:43 Thu Jul 21 0j 19 17149 ~]
[cmke6-75 rodmant]$ #now we're back on the linux host
--
Unsubscribe info: http://cygwin.com/ml/#unsubscribe-simple
Problem reports: http://cygwin.com/problems.html
Documentation: http://cygwin.com/docs.html
FAQ: http://cygwin.com/faq/