From: Andy DOT Mortimer AT aeat DOT co DOT uk (Andy Mortimer)
Subject: problem with high PIDs and ash? (was Re: Problem with wait?)
Date: 16 Jun 1998 21:01:27 -0700
To: gnu-win32 AT cygnus DOT com
Cc: Ian AT kiwiplan DOT co DOT nz

Andy Mortimer writes:
> Dear Ian and List,
>
> cgf AT cygnus DOT com (Christopher G. Faylor) writes:
> > Ian Collins wrote:
> > >I am running NT4 SP3 with b19.
> > >I keep getting an error waitforjob - no children.
> > >[...]
> > >Once I get this error message, the only recourse is to reboot the
> > >machine.
> > >
> > >Please - has anyone seen this problem? I have absolutely no idea how to
> > >solve this.
> >
> > Try running Sergey's coolview or B19.1.
>
> I don't know if this worked for you, Ian, but I've just started
> getting this error (I've just started running large builds too, so I
> don't think it's something I've just broken) with Sergey's
> latest. Again, having started happening it seems to happen for most
> commands -- although not all; usually I can do an "ls" without
> problems, but it won't let me run any shell scripts -- until I reboot.
>
> Does anyone have any idea how I would go about looking for more
> information on this? Or ideally, how to fix it?

OK, I think I've found at least a little bit of the solution, through
a couple of happy coincidences.

The first occurred to me while looking at the output from straces on
an identical shell script, before and after rebooting, while this
problem was in evidence. There weren't any obvious errors from the
cygnus stuff, but the call to wait4 was simply repeated. Then I
noticed that the PID in the failed case was just a little greater than
32768. This made the problem reproducible, with the following shell
script:

    #!/bin/sh
    echo $$
    $0 &

which simply increments the PID. Having got the PID over about 33000
(I didn't get an exact number cos the script wouldn't stop!), I again
couldn't run the shell scripts.

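The self-spawning script above never terminates on its own. A bounded variant is sketched here for anyone who wants to push the PID past a chosen value and then stop. It is only an illustration: `push_pid_past` is a hypothetical helper name, and the sketch assumes a shell whose `wait` still works at the time it is run. On a system where PIDs wrap around before reaching the target, the iteration cap is what ends the loop.

```sh
#!/bin/sh
# Sketch: spawn short-lived children until the PID of the most recent
# one exceeds a target, or until an iteration cap is hit.
# push_pid_past is a hypothetical helper, not part of any tool.
push_pid_past() {
    target=$1       # PID value to exceed, e.g. 33000
    max_tries=$2    # safety cap so the loop always terminates
    tries=0
    last_pid=0
    while [ "$tries" -lt "$max_tries" ]; do
        : &                  # start a trivial background child
        last_pid=$!          # PID of that child
        wait "$last_pid"     # reap it before spawning the next one
        tries=$((tries + 1))
        if [ "$last_pid" -gt "$target" ]; then
            break
        fi
    done
    # Print the last PID seen so the caller can check whether the
    # target was actually passed.
    echo "$last_pid"
}
```

Something like `push_pid_past 33000 100000` would then print the last child PID seen, which can be compared with the target before retrying the failing shell scripts.
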
The second thing was the realisation that, although I could run "ls"
from bash fine, when I did `sh -c ls' it gave the waitforjob error. So
it's an sh problem. I then noted that `sh -c "ls; ls"' only gave one
listing and an error, whereas `bash -c "ls; ls"' gave two listings and
no error.

A simple fix, therefore, is to copy sh.exe to ash.exe and then copy
bash.exe over sh.exe. This seems to work for me so far, at any rate.

In the longer term, it looks like there's a bug in ash. I'm
downloading the source as I type, and I'll see if I can find it
easily, but if anyone else fixes it I'd be interested to hear!

Thanks,

Andy
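The sh versus bash comparison described in this message can be scripted so the result is a number rather than something read off the screen. The following is a minimal sketch under stated assumptions: `count_runs` is a hypothetical helper name, and `ls -d .` is swapped in for the plain `ls` of the original test so that each successful run prints exactly one line.

```sh
#!/bin/sh
# Sketch: run two external commands under "SHELL -c" and count how
# many of them produced output. count_runs is a hypothetical helper.
count_runs() {
    shell=$1
    # "ls -d ." prints a single "." per successful run; errors from
    # the shell itself are discarded so only the listings are counted.
    "$shell" -c 'ls -d .; ls -d .' 2>/dev/null | grep -c '^\.$'
}
```

A healthy shell should report 2 for `count_runs sh`. A result of 1 would match the single listing plus error reported above for `sh -c "ls; ls"`, and `count_runs bash` gives the comparison case.
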