Mail Archives: djgpp-workers/2001/04/23/12:22:45
On Mon, 23 Apr 2001, The Owl wrote:
> > This means that Bash is also part of the picture, since the commands 
> > involve pipes and quoting.  It might be interesting to see if setting 
> > SHELL to point to COMMAND.COM will change the phenomena you observe, 
> > because Bash participation in this means you have one more level of 
> > nesting of DPMI programs.
> 
> anything that makes the nesting level less than 2 below 'make' (ie.
> 'make' would spawn 'gcc' directly, not via 'bash') should 'fix' the
> problem
I understand that, I just thought it would make sense to actually see if 
the crashes go away when you remove SHELL=sh.exe from djgpp.env.  I 
usually like to be sure we are on the right track with these tricky 
problems, so reality checks from time to time don't hurt.  But if you 
already checked that, or if you are sure testing this is redundant, I'm 
happy.
> > I can easily produce such a change in dosexec.c and send you the
> > diffs, but do you have a convenient way of rebuilding Make with the
> > modified libc.a?
> 
> yes, i can pretty much rebuild everything, i just did not want to
> patch libc/dosexec myself as i am not familiar with the djgpp port
> and hence wanted to let the more knowledgeable guys produce the fix.
Sure.  I will send you a patch for dosexec.c ASAP.
> > (If this works with Make and your cmd1 and cmd2, it will probably make
> > sense to rebuild gcc and binutils as well, and maybe Bash, and try some
> > deeply nested builds of complicated packages.)
> 
> yes, this is right, pretty much everything must be rebuilt that
> statically links to libc and executes other dpmi apps ;-(. maybe i can
> convince myself and will produce some easier to use workaround which
> would be a patch or extension to ntvdm or dosx - i will see how much
> time/mood i will have at the next weekend.
I don't think this is worth your while (unless you want to do this 
regardless).  I don't see anything terribly bad in having to rebuild all 
the DJGPP ports with the modified dosexec: this change will be part of 
the next DJGPP v2.04 release, and the current development sources already 
include at least one new feature (support for symlinks) which requires 
rebuilding all the ports anyway.
I only mentioned the need for a rebuild because it will be an additional 
burden on you or anyone else who would want to test this change with more 
than your single Makefile.
> > From your description, it sounds like we will still have a (much
> > narrower) window of opportunity--between the time NTVDM resets
> > _CurrentPSPSelector to zero and the time the parent calls 21/50--where
> > any interrupt that has to be reflected will crash.  Is that right?
> 
> no. this is because the exception stack gets freed only when a dpmi
> app exits while _CurrentPSPSelector is 0, which in turn gets set
> to 0 after any dpmi app exits.
Right.
> in plain english, you need *two* dpmi
> apps exit right after each other to get ntvdm free the exception
> stack. if we ensure that _CurrentPSPSelector is never 0 before a
> dpmi app exits, we solved the problem, there will be no chance for
> ntvdm to free the exception stack.
This means we need to make sure the parent doesn't exit before it gets a 
chance to call 21/50.  One way the parent could exit right away is if 
the user pressed Ctrl-C or Ctrl-BREAK while the child was running.  If 
the parent didn't mask SIGINT, it will abort as 
soon as it touches its data (this passing of SIGINT up the process group 
is a feature of DJGPP).  I will have to think how to do this; suggestions 
welcome.
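
To make the two-exit rule concrete, here is a toy model in Python of the 
behaviour as described in this thread.  It is only a sketch, not NTVDM or 
libc code: the class, the function names and the selector values 0x100 
and 0x200 are all made up for illustration, and "21/50" is taken to mean 
INT 21h function 50h (Set PSP).

```python
class NtvdmModel:
    """Toy model of the bookkeeping described above (hypothetical names)."""

    def __init__(self):
        # Assumed nonzero while a DPMI app is running; 0x100 is arbitrary.
        self.current_psp_selector = 0x100
        self.exception_stack_freed = False

    def dpmi_app_exit(self):
        # The exception stack is freed only if the selector is already
        # zero at the moment a DPMI app exits.
        if self.current_psp_selector == 0:
            self.exception_stack_freed = True
        # Every DPMI app exit resets the selector to zero.
        self.current_psp_selector = 0

    def set_psp(self, selector):
        # Stands in for the parent calling 21/50 after the child returns.
        self.current_psp_selector = selector


def run_nested(parent_restores_psp):
    """Model make -> bash -> gcc: gcc exits, then bash exits.

    Returns True if the exception stack was freed while 'make' (not
    modelled here) would still be running.
    """
    vdm = NtvdmModel()
    vdm.dpmi_app_exit()        # the child (gcc) exits
    if parent_restores_psp:
        vdm.set_psp(0x200)     # the parent (bash) calls 21/50 first
    vdm.dpmi_app_exit()        # the parent (bash) exits
    return vdm.exception_stack_freed
```

In this model, run_nested(False) is the failing case, including the 
Ctrl-C scenario where the parent aborts before it reaches 21/50, and 
run_nested(True) is the case the dosexec.c change aims for.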
> of course, if a dpmi app itself directly executes other dpmi apps
> without using libc, we get the problem
I don't think we should be worried by such a scenario: we can always tell 
the users ``then don't do that''.
> actually, in this case someone wanted to be nice and decided that
> he would clean up allocated memory that dpmi apps failed to, but at the
> same time he did not consider that dpmi apps could be nested.
So you are saying they wanted to avoid leaking DPMI selectors, which is 
known to happen on NT4, and introduced this bug in the process.  Yes, 
that would make sense.