Mail Archives: cygwin/2004/05/17/09:52:23
On Mon, 17 May 2004, John P. Rouillard wrote:
> Hello:
>
> I have noticed a problem when I start X windows. As part of my
> startup, I fire up three xterms, but only one of them actually
> completes and displays a prompt.
>
> I believe there may be a race condition in the pty allocation code as
> the three bash processes all share the same tty. "ps -ef" shows:
>
> UID       PID  PPID TTY    STIME COMMAND
> jrouilla 2332  2252   1 09:04:19 /usr/bin/bash
> jrouilla 2340  2248   1 09:04:19 /usr/bin/bash
> jrouilla 2348  2168   1 09:04:19 /usr/bin/bash
> jrouilla 2632  2332   1 09:05:01 /usr/bin/ps
>
> Judging by the PIDs, I believe 2332 was the first to start, although
> I recall that PIDs are not monotonically increasing under Cygwin, so
> YMMV. In any case, pid 2332 is the one I can interact with (verified
> using echo $$). The other two are frozen with no output or input (I
> entered a ^D, which should have exited the shell).
>
> This failure usually occurs when I first log in to Windows and run all
> my startup scripts. It is less likely to occur if I start X after
> all the rest of the login processes have run, but I can still provoke
> it then, just less often.
>
> A proper startup with three running bash/xterms looks like:
>
> UID       PID  PPID TTY    STIME COMMAND
> jrouilla 2400  2216   1 09:11:05 /usr/bin/bash
> jrouilla 2680  2204   3 09:11:05 /usr/bin/bash
> jrouilla 2732  2188   4 09:11:06 /usr/bin/bash
> jrouilla 2712  2400   1 09:11:09 /usr/bin/ps
>
> Does the last cygwin snapshot contain any code changes in the pty
> allocation area? If so I can try it and see if it helps. I am already
> running a snapshot from 20040412-23:00:24, but both 1.5.9 and this
> snapshot have the same issue AFAICT. It's an intermittent problem for
> me, but I will be happy to provide any info I can.
>
> I have attached the cygcheck output lightly edited to hide IP
> addresses and internal groups. If you need that info to debug the
> problem, I will send unedited output on request.
>
> -- rouilj

FWIW, I can confirm that this problem has existed for a while (as long as
I can remember) -- if you fire up two xterms in quick succession,
especially under heavy load, there is a good chance that they will share a
pty. The output of regular "ps" will show that the "bash" processes are
in the suspended state ("S"), and sending SIGCONT doesn't work. The first
xterm (judging by the window position) is always the one that gets the
suspended bash. Also, it seems to happen more often when the shortcut that
starts the xterm hasn't been used in a while (evicted from disk cache?),
which makes it hard to reproduce the problem twice in a row.

I believe I've reported this before, but couldn't come up with a small
reproducible testcase (although I just managed to reproduce it on my
machine -- Win2kPro SP3, Cygwin 1.5.9 -- again, using the above recipe).

It's annoying enough that I'd like to try debugging it. Of course, as
with most races, running the xterms under strace fixes it... Attaching to
a hung bash is, IMO, useless, as all of the pty assignments have already
happened by that point. Any pointers on how to catch this in the act
are appreciated.
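
Short of catching the race itself, the collision can at least be flagged
right after startup by scanning "ps -ef" output for shells that report the
same TTY. The following is a minimal Python sketch, not anything taken from
the Cygwin sources. It assumes the six-column layout quoted above (UID, PID,
PPID, TTY, STIME, COMMAND); the function name shells_sharing_tty and the
sample constants are hypothetical, made up for illustration.

```python
from collections import defaultdict

# Sample "ps -ef" output copied from the report above (quote markers removed).
BAD_STARTUP = """\
UID PID PPID TTY STIME COMMAND
jrouilla 2332 2252 1 09:04:19 /usr/bin/bash
jrouilla 2340 2248 1 09:04:19 /usr/bin/bash
jrouilla 2348 2168 1 09:04:19 /usr/bin/bash
jrouilla 2632 2332 1 09:05:01 /usr/bin/ps
"""

GOOD_STARTUP = """\
UID PID PPID TTY STIME COMMAND
jrouilla 2400 2216 1 09:11:05 /usr/bin/bash
jrouilla 2680 2204 3 09:11:05 /usr/bin/bash
jrouilla 2732 2188 4 09:11:06 /usr/bin/bash
jrouilla 2712 2400 1 09:11:09 /usr/bin/ps
"""


def shells_sharing_tty(ps_output, shell_name="bash"):
    """Return {tty: [pids]} for every TTY claimed by more than one shell.

    Assumes whitespace-separated columns: UID PID PPID TTY STIME COMMAND.
    """
    pids_by_tty = defaultdict(list)
    for line in ps_output.splitlines():
        fields = line.split()
        # Skip the header, blank lines, and anything without a numeric PID.
        if len(fields) < 6 or not fields[1].isdigit():
            continue
        pid, tty, command = int(fields[1]), fields[3], fields[5]
        # Match on the executable's base name, e.g. /usr/bin/bash -> bash.
        if command.rsplit("/", 1)[-1] == shell_name:
            pids_by_tty[tty].append(pid)
    return {tty: pids for tty, pids in pids_by_tty.items() if len(pids) > 1}


if __name__ == "__main__":
    print("bad startup: ", shells_sharing_tty(BAD_STARTUP))
    print("good startup:", shells_sharing_tty(GOOD_STARTUP))
```

A child process such as "ps" legitimately inherits its parent shell's TTY,
which is why only the shells themselves are compared. This only detects the
shared pty after the fact; it does not show where in the allocation code the
race occurs.
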
Igor
--
http://cs.nyu.edu/~pechtcha/
|\ _,,,---,,_ pechtcha AT cs DOT nyu DOT edu
ZZZzz /,`.-'`' -. ;-;;,_ igor AT watson DOT ibm DOT com
|,4- ) )-,_. ,\ ( `'-' Igor Pechtchanski, Ph.D.
'---''(_/--' `-'\_) fL a.k.a JaguaR-R-R-r-r-r-.-.-. Meow!
"I have since come to realize that being between your mentor and his route
to the bathroom is a major career booster." -- Patrick Naughton