From: "Robert Collins"
To: "Randall R Schulz"
Subject: RE: OT: possible project/research project
Date: Wed, 20 Mar 2002 20:33:21 +1100
Delivered-To: mailing list cygwin AT cygwin DOT com

Randall, responses inline.

> -----Original Message-----
> From: Randall R Schulz [mailto:rrschulz AT cris DOT com]
> Sent: Wednesday, March 20, 2002 7:34 PM

> >Well we still have that basic separate - bash's builtin's for example.
> >If it's not builtin, it needs a sub process.
>
> That's not quite right. Built-ins still need sub-processes if they're
> going to operate in a pipeline or are enclosed within parentheses.

Ok. So if it's not a builtin, or it's a builtin that needs to be pipelined
or parenthesised, it requires a sub-process. That sounds like somewhere a
patch to the relevant shell might provide some easy wins.

> >sub process's after all) - but we have the source so....
>
> How will your magical push_context protect from wild pointer references,
> e.g.?

If that becomes a problem, I'd suggest that DLLs get loaded on page
boundaries, that we protect the non-permitted address space as read-only,
and that we install an exception handler that unprotects and restores
context. It may be that handling that is not worth the development time -
so reliability could be an issue.

> >The fork()/exec() model bites. Sorry, but it does. fork() based servers
> >for instance run into the galloping herd - and scale very badly.
> >The other use for fork - the fork/exec combination - is better achieved
> >with spawn() which is designed to do just that one job well. It also
> >happens to work very well on cygwin, and I see no reason to change that.
> >So spawned apps will remain completely separated and independent.
>
> Servers are not shells. Why should they fork at all? That's what threads
> are for. It's also why CGI (without something like mod_perl) is not a
> good thing and the Java server model has significant advantages.

Exactly... my point is that the fork/exec model has no innate use.
vfork/execve does - which is what spawn (look under posix_spawn() for the
official spawn these days) accomplishes.

> Are you planning on incorporating your scheme into every program that
> runs sub-processes on a regular basis? How likely is it that what works
> in one shell will work in another or in a server?

No. I'm not trying to create a new operating environment; I'm trying to
address a common-case issue. If I can get certain configure scripts to run
in under 30 minutes on my machine here, I'd be very happy. As for
portability to different shells, or even to servers, I'd suggest that
keeping the API very simple and clean - much like the sub-process model is
simple and clean - would encourage such re-use.

> I don't know the details of spawn(). How does it accomplish I/O
> redirection?

int posix_spawn(pid_t *restrict pid, const char *restrict path,
                const posix_spawn_file_actions_t *file_actions,
                const posix_spawnattr_t *restrict attrp,
                char *const argv[restrict], char *const envp[restrict]);

is the prototype. If file_actions is null, then the new process gets a
copy of the parent's fd table. If it's not null, then the open, close and
dup2 actions it lists are applied to that copy to produce the fd table for
the new process.

> Obviously if you add something, the old stuff isn't (necessarily) lost.
> I'm just saying that the fork/exec process model is simple, elegant,
> available, universal and fully functional in all POSIX systems.
> Your model is a horse of another color and any given command that would
> avail itself of the supposed benefits of your scheme must be recast into
> a library that conforms to the requirements of your embedded task model.

Yes. Which is a significant impediment right from the word go. Which
should go some way to explaining my ambivalence on this idea. However, the
building blocks to use this model are present and functional on all POSIX
systems, so there's no reason to assume we couldn't 'make it work'.

> It doesn't prevent it, but to avail ones self of the putative benefits
> of your proposed scheme, a significantly different programming model has
> to be learned and used. All for what? A tiny incremental improvement in
> program start-up times on a single platform and one or two pre-ordained
> shells?

Huh? That's an assumption. I'd hope I could achieve librarisation as
simply as recasting main as lib_main, and providing link-time replacements
for exit(), _exit() and fatal(). The real binary then doesn't use those
link-time replacements.

> How much time do they save? That's for you to claim and substantiate.
> I'm not trying to justify or validate your project, I'm trying to
> repudiate it.

I can tell. I'm not trying to defend it, as that assumes that it is
defensible. I'm discussing it in a neutral(ish) light, I hope. I am trying
to provide responses to the specific points you make as part of that
discussion.

> But consider this: By the time you complete this task, the upward march
> of system speeds (CPU and I/O) will probably have done more to improve
> elapsed-time performance of command invocation than your improvements
> are going to achieve.

Straw poll: who here has and uses a machine more than 2 years old right
now? My hand goes up, as does my girlfriend's, and my firewall's. (My PC
happens to be a dual processor, but still.) Also, consider that as system
speeds increase, so does the functionality.
We may find MS polling internet servers on process startup, or something
equally ridiculous that drastically increases process startup time.
Certainly system policies now play a part, as each process startup has to
be tested against an arbitrarily long list of rules. And don't talk about
virus scanners.

> And five staff-minutes per user per month? You think that's significant?
> What would you do with those five minutes spread throughout the month?
> That's right: Nothing, 'cause you'd get it in fraction-of-a-second
> parcels.

Well, that's an assumption. For me, I'd get it running configure scripts,
which is in far bigger chunks than fractions of a second.

> Lastly, you'll have to have an ongoing effort to port changes from the
> stand-alone original versions of the commands to your embedded
> counterparts.

No - sounds like you haven't been paying attention. In my very first email
I pointed out that this was not an acceptable approach, and that committing
changes upstream would be the only meaningful way of doing this.

> >I'd guess at ash, as that's the smallest shell we have, but if it's
> >easier with bash, then I see no reason not to - as this would be a
> >/bin/sh replacement - if the benefits were to be realised.
>
> How many people use such a bare-bones shell? Unless you modify them all,
> there will be a sizeable user contingent that does not benefit from your
> efforts.

Nearly everyone here does - most scripts have #!/bin/sh in the header.

> I think you need a good technical justification for the effort you'll
> expend relative to the benefits you're going to gain and the detriments
> you're going to incur.

Absolutely. The problem domain needs further refinement, a lit search is
needed, some rough test cases / mock-ups are needed to provide a
rule-of-thumb idea about the potential returns, and cygwin needs serious
profiling to understand if my assumptions about performance are correct.
Lotsa work to do this right.
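
As a rough illustration of the kind of mock-up meant here, the following is a minimal sketch, assuming a POSIX system that provides posix_spawnp() and clock_gettime(). It launches the same command two ways, once with fork() + execvp() and once with posix_spawnp(), in both cases redirecting the child's stdout to a file; in the spawn path the redirection is expressed as a file action, as in the posix_spawn() discussion above. The function names (wait_status, run_fork_exec, run_spawn, time_launches) are hypothetical helpers invented for this sketch, and it does not model the proposed in-process DLL scheme or any push_context / pop_context cost.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <fcntl.h>
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

extern char **environ;

/* Wait for pid; return its exit status, or -1 if it did not exit normally. */
int wait_status(pid_t pid)
{
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* Launch argv via fork() + execvp(), with stdout redirected to outpath. */
int run_fork_exec(const char *outpath, char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: set up the redirection by hand, then exec. */
        int fd = open(outpath, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0 || dup2(fd, STDOUT_FILENO) < 0)
            _exit(126);
        if (fd != STDOUT_FILENO)
            close(fd);
        execvp(argv[0], argv);
        _exit(127);                /* only reached if exec failed */
    }
    return wait_status(pid);
}

/* Launch argv via posix_spawnp(); the redirection is a file action. */
int run_spawn(const char *outpath, char *const argv[])
{
    posix_spawn_file_actions_t fa;
    pid_t pid;
    int rc;

    if (posix_spawn_file_actions_init(&fa) != 0)
        return -1;
    /* In the child, open outpath directly onto fd 1 (stdout). */
    rc = posix_spawn_file_actions_addopen(&fa, STDOUT_FILENO, outpath,
                                          O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (rc == 0)
        rc = posix_spawnp(&pid, argv[0], &fa, NULL, argv, environ);
    posix_spawn_file_actions_destroy(&fa);
    return rc == 0 ? wait_status(pid) : -1;
}

typedef int (*launcher_fn)(const char *, char *const[]);

/* Wall-clock seconds for n launches, or -1.0 if any launch fails. */
double time_launches(launcher_fn run, const char *outpath,
                     char *const argv[], int n)
{
    struct timespec t0, t1;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < n; i++)
        if (run(outpath, argv) != 0)
            return -1.0;
    clock_gettime(CLOCK_MONOTONIC, &t1);
    return (double)(t1.tv_sec - t0.tv_sec)
         + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
}
```

A caller might time a few hundred launches of a trivial command through each path, for example time_launches(run_spawn, "/dev/null", argv, 200) against time_launches(run_fork_exec, "/dev/null", argv, 200), and compare the two figures. Any numbers obtained this way are only a rule of thumb for the platform they were measured on.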
> As with all optimizations, you must measure the cost of the current code
> and that of replacement. In this case, you could possibly mock up a test
> jig that did DLL loading and compare that with the cost of fork / exec.
> But that would not include the unknown costs of your putative
> push_context / pop_context mechanism.

Absolutely. In fact:

"
Rules of Optimization:
Rule 1: Don't do it.
Rule 2 (for experts only): Don't do it yet.
 - M.A. Jackson

"More computing sins are committed in the name of efficiency (without
necessarily achieving it) than for any other single reason - including
blind stupidity."
 - W.A. Wulf

"We should forget about small efficiencies, say about 97% of the time:
premature optimization is the root of all evil."
 - Donald Knuth

"The best is the enemy of the good."
 - Voltaire
"
With assembly credit to http://www-2.cs.cmu.edu/~jch/java/optimization.html

> "The proof of the pudding is in the eating." So until you've done it,
> you won't know for an empirical fact if it's a win and if so how much of
> a win it is.

Sure.

Rob