Message-ID: <17B78BDF120BD411B70100500422FC6309E1FE@IIS000>
From: Bernard Dautrevaux
To: "'DJ Delorie'" , jik-cygwin AT curl DOT com
Cc: cygwin AT cygwin DOT com
Subject: RE: Optimizing away "ReadFile" calls when Make calls stat()
Date: Wed, 14 Feb 2001 11:23:42 +0100

> -----Original Message-----
> From: DJ Delorie [mailto:dj AT delorie DOT com]
> Sent: Tuesday, February 13, 2001 8:54 PM
> To: jik-cygwin AT curl DOT com
> Cc: cygwin AT cygwin DOT com
> Subject: Re: Optimizing away "ReadFile" calls when Make calls stat()
>
> > As I've noted separately, reading tens of thousands of files even once
> > incurs a significant performance penalty.
>
> True, but reading them all once is better than reading them all twice.
> I'm trying to break the problem down into small enough changes that we
> actually have a chance of implementing them.
>
> > The change I've proposed can eliminate reading them at all.
>
> But not in a way that we can make it the default.  Perhaps you could
> propose a set of mount flags to optimize common situations?  We
> already have one to avoid the read-for-execute test, perhaps you could
> work on an assume-no-symlinks flag?  Then we wouldn't need a custom
> make.exe (or any other program).
>
> > But it does nothing at all for the "usual case" I'm trying to
> > optimize, which is Make stat()ing a file but never reading it.
>
> It does, because stat() reads the file twice, once to see if it's a
> symlink, and once to see if the executable bit needs to be set.
>
> > > These should be easier wins (thus, more doable) than a global cache,
> > > which NT should be providing itself as part of the disk cache
> > > subsystem (for local drives, at least).  I don't think it's
> > > appropriate for cygwin to go beyond this anyway - too many race
> > > conditions arise.
> >
> > As far as I know, there are no race conditions in the change I
> > suggested.  In fact, it *removes* race conditions, since it reduces
> > the number of distinct OS operations that must be performed on a file
> > during stat().
>
> Right, but others were suggesting a global cache of file bytes.
> *That* would introduce race conditions.
>

Perhaps a solution would be to maintain what could be called a "partial"
stat() cache: a global cache holding the results of ALL the ReadFile()
calls (which I think can easily be reduced to one per file), together
with each file's last-modified time (LMT). stat() would then ALWAYS
check the LMT of the ACTUAL file and compare it with the cache. If the
cache entry is up to date, stat() returns the execute/symlink flags
found in the cache. If the entry is obsolete or absent, it just re-reads
the file's content and saves the LMT/exec/symlink values in the cache.

The only race condition would be when UPDATING the cache (reading is no
problem if we first change exec/symlink and then update the LMT); this
should be simple to handle.

Regretfully, I don't have time to look at this (and don't know how it is
actually implemented now), but it should provide quite a big win for
cygwin.

Regards,

	Bernard

--------------------------------------------
Bernard Dautrevaux
Microprocess Ingenierie
97 bis, rue de Colombes
92400 COURBEVOIE
FRANCE
Tel:	+33 (0) 1 47 68 80 80
Fax:	+33 (0) 1 47 88 97 85
e-mail:	dautrevaux AT microprocess DOT com
		b DOT dautrevaux AT usa DOT net
--------------------------------------------
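
For illustration only, the "partial" stat() cache proposed in the reply above might look roughly like the sketch below. It is written in Python rather than against the Cygwin sources, and every name in it (`cached_flags`, `_probe_content`, `_cache`, `counters`) is hypothetical. The symlink marker and the "#!"/"MZ" executable test are simplifying assumptions, not a description of what cygwin's stat() actually checks.

```python
import os

# Hypothetical in-process cache: path -> (mtime_ns, is_symlink, is_exec).
_cache = {}
# Counts how often file contents were really read (for demonstration).
counters = {"content_reads": 0}

# Assumption: a symlink is a regular file starting with this marker.
SYMLINK_MAGIC = b"!<symlink>"


def _probe_content(path):
    """Read the file once and derive both flags from the same bytes."""
    counters["content_reads"] += 1
    with open(path, "rb") as f:
        head = f.read(len(SYMLINK_MAGIC))
    is_symlink = head == SYMLINK_MAGIC
    # Assumption: a script ("#!") or PE image ("MZ") counts as executable.
    is_exec = head[:2] in (b"#!", b"MZ")
    return is_symlink, is_exec


def cached_flags(path):
    """Return (is_symlink, is_exec), reading the file only when its
    last-modified time differs from the cached one."""
    # Always ask the OS for the current last-modified time first.
    mtime = os.stat(path).st_mtime_ns
    entry = _cache.get(path)
    if entry is not None and entry[0] == mtime:
        return entry[1], entry[2]          # cache is up to date
    is_symlink, is_exec = _probe_content(path)
    # Flags and timestamp are stored together as one tuple, so a reader
    # never sees new flags paired with an old timestamp or vice versa.
    _cache[path] = (mtime, is_symlink, is_exec)
    return is_symlink, is_exec
```

Calling `cached_flags()` twice on an unchanged file should read its contents only once; the second call costs a single `os.stat()`. A real implementation shared between processes would still need the update ordering (or locking) discussed in the message, which this single-process sketch sidesteps.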