Date: Thu, 15 Sep 2005 10:45:10 -0400 (EDT)
From: Igor Pechtchanski
To: Jan Schormann
cc: cygwin AT cygwin DOT com
Subject: RE: Cygwin build system SOOOO SLOOOWWWW ???

On Thu, 15 Sep 2005, Jan Schormann wrote:

> Let's see ...
>
> > 1) How can I tell what Cygwin is doing?  Is there a tool that will
> > tell me what tool is actually running at any given time?  Is there
> > any way to tell what Cygwin is doing down in its guts?  Does anyone
> > have any other suggestions as to how I might get to the
> > bottom of this?
>
> Below, I'll tell about some suspicions I have about what cygwin might
> actually be doing. To your question, I can offer two Ideas:
>
> - "top" or any Windoze Process Explorer more sophisticated than
>   the task manager
> - "strace" - though I haven't ever used it, but from what I know
>   this will definitely give you an answer - maybe too much of it ;-)

You can give strace command-line options to show only the kinds of events
you want...  See the strace help (or ).

> > 2) Has anyone else experienced speed problems with Cygwin?  Has
> > anybody else felt that Cygwin has gotten slower over the last
> > year or so?  Are there any guidelines or "tricks" for getting
> > Cygwin to run faster?
>
> a) Forking is more expensive in Windoze.
>    On Unix, especially in make environments, you'll often start new
>    processes as you're going - and often you'll not even notice. Google
>    for "bash tricks" on how to fork less often.

Forking is not as expensive in Windows as it is in Cygwin (especially if
you fork off a Windows process, since Cygwin creates a stub for that).
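As a rough illustration of what "forking less often" can mean in practice,
here is a minimal sketch, assuming bash.  The variable names and the loop
count of 100 are made up for the example, and the timings it prints will
vary from machine to machine; it is only meant to let you compare the two
forms on your own system:

```shell
#!/usr/bin/env bash
# Sketch: the same string replacement done with and without extra processes.
# BLAH, forked and builtin_result are illustrative names only.

BLAH=blub

# Creates processes: backticks run a subshell, and sed is a separate program.
forked=`echo "$BLAH" | sed -e 's/b/x/g'`

# Creates no process: bash does the pattern substitution itself.
builtin_result=${BLAH//b/x}

echo "forked:  $forked"
echo "builtin: $builtin_result"

# Optional comparison: repeat each form 100 times (arbitrary count) and let
# bash's "time" keyword report how long each loop takes.
time for ((i = 0; i < 100; i++)); do
    tmp=`echo "$BLAH" | sed -e 's/b/x/g'`
done

time for ((i = 0; i < 100; i++)); do
    tmp=${BLAH//b/x}
done
```

Both forms produce the same string; only the number of processes created
along the way differs.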
>    Hint: Don't use "sed" in `backticks` just for simple string
>    replacements.
>    Much of this can be done in make or bash directly.
>    Look at the changes you made - maybe you thought it's more elegant?

FWIW, much of this can be done directly in "make". :-)

> b) This is especially true for shells.
>    I'm not really sure on when and where this hits, but under certain
>    circumstances, bash needs to parse /etc/passwd when it starts. Do
>    you create /etc/passwd from an LDAP directory using mkpasswd?

*bash* itself never parses /etc/passwd.  Cygwin does -- every Cygwin
process looks at /etc/passwd on startup.  The first Cygwin process
actually reads it, and the rest simply check whether it changed.  However,
that's just a file stat -- it doesn't actually query the domain or LDAP
directory (at least after the first invocation -- it does query the
current user then, but I don't think it does that for all users).

>    Maybe you have hired some more people last year and it got longer?
>    Hint: Try whether it makes a difference if you replace /etc/passwd
>    with one that contains only the local users (look at the options for
>    mkpasswd).

This shouldn't make a difference for multiple forks.

> c) /bin/sh is now bash, which is now dynamically linked.
>    Up until a few months ago, /bin/sh has been "ash", a smaller, but
>    less powerful shell. This has been replaced by bash, to reduce the
>    traffic of repeated questions along the lines of "why does my shell
>    act different than on linux" (where /bin/sh is bash on most
>    distributions).
>    If I understood the traffic on this list correctly, bash is now
>    dynamically linked, which might have an impact on starting it - I
>    can't tell.

It shouldn't.  The DLLs are in memory, so any subsequent invocation of
bash will load the cached versions (Windows does that automatically).

>    Hint: Don't start bash so often. Create fewer processes, but if you
>    must, see if you gain by using ash explicitly instead of bash.
>
> To the gurus - is the following correct?
> `echo blub` starts one process, `echo blub | sed -e 's/b/x/g'`
> starts three: "echo", "sed", and "bash" to implement the pipe.

I'm far from a guru, but let me take a shot at answering this:
If you're talking about running those from bash, then "echo blub" doesn't
start *any* processes -- "echo" is a bash builtin.  If you're asking about
make, make will start /bin/sh to execute the "echo" command.
"echo blub | sed -e ..." will start 1 process from bash, and 2 from make
(the 1 extra process is "sed" -- no process is created for the pipe).
FYI, "BLAH=blub; echo ${BLAH//b/x}" will not spawn *any* processes when
run directly from bash.

> d) Beware of lazy evaluation.
>    Look at this construct:
>      CFLAGS=$(shell find . -type d -name include)
>    Read "info make" on setting variables and find out about the
>    difference between "=" and ":=". The above will run the find
>    again for every single call to the compiler. Along with the
>    issues about forking and reading directories and small files,
>    this can make a difference of *ages*.
>    Hint: See whether you can use fewer variables, use ":=" more often,
>    etc. - and don't use "$(shell ...)" anyway, as stated in a).
>    Rather, pre-compute makefiles with all the data hardcoded, using
>    ":=".

That's sound advice.

> e) Reading lots of small files seems more expensive on Windoze.
>    I don't know about your Makefiles, but traditionally, makefiles are
>    spread across project directories (for build hierarchies), and
>    makedepend creates even more of that. For one of our applications, I
>    roughly calculated that make needs to open, read, and parse well over
>    a thousand files (not counting the source or objects or any such
>    thing, just the makefiles), just for telling you that all the targets
>    are up to date.
>    Hint: Phew ...
>
> You see, for our configurations, running make to tell me that *nothing*
> has changed could take up to half an hour.
> Therefore we introduced some
> magic using Python to generate and split up makefiles two years ago, and
> were down below five minutes again.

If you're using make recursively, google on the evils of recursive make.
If not, please disregard this.

> This is nothing compared to the link time of well over 15 minutes, so
> we started to convert to DLLs for development (released applications
> are still supposed to be linked statically, as they only run on
> dedicated machines). We're currently trying to replace the whole build
> chain by a single daemon written in a decent language - hoping (i) that
> we need only one process for the actual rule system etc., and will only
> start additional processes for the compiler and linker; and hoping (ii)
> that the actual rule set will be much easier to debug. (You know,
> developers come to me and say "but I've only touched this little cpp and
> now everything's getting compiled again and ..." - how do I know what
> really happened?)
>
> > Thanks in advance for any feedback that might help me speed up my
> > builds.
>
> Let's see whether my hints are any good, but you're welcome anyway :-)

HTH,
        Igor
-- 
                                http://cs.nyu.edu/~pechtcha/
      |\      _,,,---,,_        pechtcha AT cs DOT nyu DOT edu
ZZZzz /,`.-'`'    -.  ;-;;,_    igor AT watson DOT ibm DOT com
     |,4-  ) )-,_. ,\ (  `'-'   Igor Pechtchanski, Ph.D.
    '---''(_/--'  `-'\_) fL     a.k.a JaguaR-R-R-r-r-r-.-.-.  Meow!

If there's any real truth it's that the entire multidimensional infinity
of the Universe is almost certainly being run by a bunch of maniacs. /DA