Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm
Sender: cygwin-owner AT cygwin DOT com
Mail-Followup-To: cygwin AT cygwin DOT com
Message-Id: <5.1.0.14.2.20020319073003.00ac4960@pop3.cris.com>
Date: Tue, 19 Mar 2002 17:15:13 -0800
To: "Robert Collins" <robert DOT collins AT itdomain DOT com DOT au>, <cygwin AT cygwin DOT com>
From: Randall R Schulz <rrschulz AT cris DOT com>
Subject: Re: OT: possible project/research project
In-Reply-To: <FC169E059D1A0442A04C40F86D9BA76062DA@itdomain003.itdomain.
 net.au>
Mime-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"; format=flowed
Content-Transfer-Encoding: 8bit

Robert,

This idea isn't really new. I remember people talking about it back in the 
System 6, System 7 and 32v days, when programs were starting to get bigger, 
disks were still pretty slow, main store rather small and there was not yet 
a copy-on-write fork(2) or a vfork(2). (Not to mention the meager control 
flow in the pre-Bourne shell that used a sub-process to effect a seek(2) on 
the standard input being interpreted by the shell running the script!!!)

The problem is that you're taking on a huge project that adds no new 
functionality and that has horrendous maintenance issues, as you say.

The library conversion idea is kind of a throwback to pre-Unix days or to 
systems like VMS (if I recall and understand it properly). In these systems 
there were "blessed" commands understood by the command interpreter and 
endowed with a more direct means of invocation. Other commands required 
full sub-process creation.

I trust it's your intent that the user will see no obvious differences in 
invoking these programs, but you may find full transparency harder to 
achieve than you expect. Will the full range of shell features be available 
to these specially integrated commands? Will you be able to pipe into and 
out of them? Will they work within parentheses? In procedures? Will you 
allow all shell features (pipes, say) to be applied to arbitrary 
combinations of conventional and integrated commands?

In your example of a `backquote command` (which I prefer to invoke via $( 
... ) using BASH) you'd be exposed to any unintended side-effects within 
the backquote command. Side-effects like file descriptor alterations, 
changes in signal dispositions, receipt of signals or exceptions (expected 
or the result of a programming error).
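
A minimal sketch of that exposure, in Python rather than in any real shell. 
The names integrated_cd, separate_cd and demo are hypothetical, and the 
working directory and an environment variable stand in for the file 
descriptors and signal dispositions mentioned above. The same work is done 
once in a child process and once in the caller's own process:

```python
import os
import subprocess
import sys
import tempfile


def integrated_cd(path):
    """Hypothetical 'integrated' command: runs inside the caller's process."""
    os.chdir(path)               # the side effect lands on the caller itself
    os.environ["LEAKED"] = "1"   # so does this one


def separate_cd(path):
    """The same work done in a separate child process."""
    code = "import os, sys; os.chdir(sys.argv[1]); os.environ['LEAKED'] = '1'"
    subprocess.run([sys.executable, "-c", code, path], check=True)


def demo():
    """Return (cwd unchanged?, variable leaked?) after each style of call."""
    start = os.getcwd()
    os.environ.pop("LEAKED", None)
    with tempfile.TemporaryDirectory() as tmp:
        separate_cd(tmp)
        after_child = (os.getcwd() == start, "LEAKED" in os.environ)
        integrated_cd(tmp)
        after_in_process = (os.getcwd() == start, "LEAKED" in os.environ)
        # Undo the leak by hand before the temporary directory is removed.
        os.chdir(start)
        os.environ.pop("LEAKED", None)
    return after_child, after_in_process
```

The child's changes vanish when it exits, while the in-process version 
leaves the caller to clean up after it, which is the bookkeeping an 
integrating shell would have to do for every command it absorbs.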

The beauty of the fork/exec model with entirely separated programs _is_ 
their self-containedness and the complete independence and isolation each 
of the programs gets from each other and from the program(s) that invoke 
them. It is also nice in that it is a very simple programming model for 
commands, both built-in and end-user-supplied, that run within it. It is 
probably less platform-specific than a scheme that demands use of 
dynamically-linked / shared libraries.
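
To illustrate the isolation point, here is a small sketch under the same 
assumptions as before (Python instead of a shell; run_isolated and 
run_in_process are hypothetical helper names). A failure in a separate 
process reaches the parent only as an exit status, while the same failure 
in-process propagates straight into the caller:

```python
import subprocess
import sys


def run_isolated(code):
    """Run Python source in a separate interpreter and return its exit status."""
    # stderr is discarded only to keep the demonstration quiet.
    result = subprocess.run([sys.executable, "-c", code],
                            stderr=subprocess.DEVNULL)
    return result.returncode


def run_in_process(code):
    """Run the same source inside the caller; any failure is the caller's problem."""
    exec(code, {})
    return 0
```

With this sketch, run_isolated("raise RuntimeError('boom')") simply reports 
a non-zero status, whereas run_in_process with the same source raises 
RuntimeError in the calling program.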

The Unix shell and process model may be somewhat costly in computing 
resources (but only marginally so), especially, as I said, without 
copy-on-write behavior in the fork call, but that rather modest downside 
is more than made up for by the independence, modularity, and 
open-endedness of the scheme.

I can't see how all the work your idea implies just for the sake of some 
incremental performance improvements is going to be worthwhile.

By the way, which shell will you do this for? BASH, TCSH, Ash? More than one?

Please feel free to prove me wrong, of course.

Randall Schulz
Mountain View, CA USA


At 22:50 2002-03-18, Robert Collins wrote:
>Just a curiosity...
>
>I've a mental concept I've been batting around for a while - about how can 
>we drastically increase configure and related script performance on cygwin...
>
>AFAICT the largest performance issue is fork() and exec(). File access is 
>quite fast, as is networking. Unix sockets are a go slow given Ralf's 
>testing :p but that's about it.
>
>So, what I'm thinking could be done is:
>Create a new shell. For the most common current causes of fork()/exec(), 
>make those commands internal. Specifically, make all expression evaluation 
>(such as `basename foo`) done in-process (i.e. C-style 
>code: {save_context(); evaluate(expression); pop_context(result);}), only 
>spawning commands where they are not internal. (Currently, AFAIK, ash and 
>bash use sub-shells quite commonly).
>
>Now that would be a maintenance and coding nightmare - repeating lots of 
>other folk's work, and having to get bug compatibility as well.... no thanks.
>
>What if, instead of rewriting all those helper commands, we
>*) Make each one into a library - i.e. cygshellbasename0.dll - with a 
>well-defined interface (say, execute (int argc, char **argv)), AND no ABI 
>changes!
>*) Replace the current binary with a façade that uses the .dll.
>*) in the shell, look for the library *before* calling the binary, thus 
>saving a spawn()
>*) Ideally, adapt an existing shell rather than starting new (I'm not a 
>reinvent-ze-wheel kinda guy).
>
>Now I imagine that if done _properly_ the upstream authors won't object 
>too much to librarization, so the amount of code to be written is 
>significantly shrunk.
>
>I've not seen a specific project to accomplish this (in 
>google/freshmeat/sourceforge) - but I figure that cygwin is _such_ a prime 
>platform for it that if one exists, and I'd be repeating work, I'll find 
>someone who knows it here....
>
>Anyway, this is (obviously) a long-term proposition, but if two or three 
>folk from here would be interested in collaborating on such a project...
>
>Cheers,
>Rob

