Mail Archives: cygwin/2005/06/04/20:55:22

Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm
List-Subscribe: <mailto:cygwin-subscribe AT cygwin DOT com>
List-Archive: <http://sourceware.org/ml/cygwin/>
List-Post: <mailto:cygwin AT cygwin DOT com>
List-Help: <mailto:cygwin-help AT cygwin DOT com>, <http://sourceware.org/ml/#faqs>
Sender: cygwin-owner AT cygwin DOT com
Mail-Followup-To: cygwin AT cygwin DOT com
Delivered-To: mailing list cygwin AT cygwin DOT com
Date: Sat, 4 Jun 2005 20:55:08 -0400
From: Christopher Faylor <cgf-no-personal-reply-please AT cygwin DOT com>
To: cygwin AT cygwin DOT com
Subject: Re: Performance problems
Message-ID: <20050605005508.GA2706@trixie.casa.cgf.cx>
Reply-To: cygwin AT cygwin DOT com
References: <4297A14B DOT 9070409 AT plausible DOT org> <20050528131501 DOT V53507 AT logout DOT sh DOT cvut DOT cz> <20050528160424 DOT GB12395 AT trixie DOT casa DOT cgf DOT cx> <429ED094 DOT 9080001 AT tlinx DOT org> <Pine DOT GSO DOT 4 DOT 61 DOT 0506021301320 DOT 10282 AT slinky DOT cs DOT nyu DOT edu> <20050602172226 DOT GC6597 AT trixie DOT casa DOT cgf DOT cx> <42A2246D DOT 3090000 AT tlinx DOT org>
Mime-Version: 1.0
In-Reply-To: <42A2246D.3090000@tlinx.org>
User-Agent: Mutt/1.5.8i

On Sat, Jun 04, 2005 at 03:00:13PM -0700, Linda W wrote:
>Christopher Faylor wrote:
>>On Thu, 2 Jun 2005, Linda W wrote:
>>>In tracing the Win32 file operations, find seems to perform multiple
>>>file open operations for each file processed.  One way to speed up
>>>operations in this area might be to keep a "cache" of the last "N" file
>>>handles.  I suspect it's just the Windows path lookup mechanism being
>>>slow to reopen things.  But if the cygwin.dll could cache even the past
>>>5 entries, it might speed things up significantly.  If it is opened
>>>each time to read different information, it might be much cheaper to
>>>collect all the information at one time and cache it in an internal
>>>"inode cache" that could expire in a second or so.  If it would "slow"
>>>down other programs, it could have some smarts in the system calls to
>>>look for calling patterns from programs like find that need a couple or
>>>more openings to fully "process a file", that all happen within a few
>>>milliseconds of each other.
>>>
>>Oddly enough, Corinna and I have been discussing the possibility of
>>caching opendir/readdir data for subsequent use in stat().  She's for it
>>and I'm mildly agin' it.
>> 
>>I think that introducing caching opens the door to all sorts of subtle
>>race conditions since only the OS can maintain cache coherency.
>
>You are technically accurate, but the cygwin layer is a POSIX
>compliant OS emulation layer by some definition, no?

Yes, but that has nothing to do with caching.  Cygwin is just a DLL.  It
can't monitor all file transactions in the whole system.
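
To make the coherency concern concrete, here is a minimal sketch in
Python. It is not Cygwin code: the TtlStatCache class and the demo()
function are hypothetical names invented for illustration, and the
sketch assumes a purely in-process cache that holds stat results for a
fixed time. Because nothing tells the cache about changes made by
anyone else, it keeps reporting a plain file after the path has been
replaced by a directory.

```python
import os
import stat
import tempfile
import time


class TtlStatCache:
    """Hypothetical per-process stat cache; entries expire after ttl seconds."""

    def __init__(self, ttl=1.0):
        self.ttl = ttl
        self._entries = {}  # path -> (expiry time, os.stat_result)

    def stat(self, path):
        now = time.monotonic()
        hit = self._entries.get(path)
        if hit is not None and hit[0] > now:
            return hit[1]  # served from the cache, the OS is not consulted
        result = os.stat(path)
        self._entries[path] = (now + self.ttl, result)
        return result


def demo():
    """Return (cached_is_dir, actual_is_dir) after the path changes type."""
    cache = TtlStatCache(ttl=60.0)
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "entry")
        with open(path, "w") as f:
            f.write("data")
        cache.stat(path)  # prime the cache while the path is a regular file

        # Stand-in for another process: remove the file, create a directory.
        os.remove(path)
        os.mkdir(path)

        cached_is_dir = stat.S_ISDIR(cache.stat(path).st_mode)
        actual_is_dir = stat.S_ISDIR(os.stat(path).st_mode)
        return cached_is_dir, actual_is_dir
```

Calling demo() returns (False, True): the cached answer still describes
a regular file while the OS reports a directory. A shorter expiry time
narrows the window but does not close it.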

>I wouldn't cache data without keeping the associated handles to the
>corresponding file objects open.  As long as they are kept open,
>Windows would disallow things like deleting the file and replacing
>it with a directory.  That should control most race conditions
>with some degree of relative safety.

You can't do that without taking the fact that the handle is open into
account when cygwin itself removes a file, opens a file, renames a file.
And it could be pretty surprising to find that when process A does an
opendir/readdir, process B is now unable to delete a file.
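
The effect of a lingering handle can be tried with a small Python
sketch. The function name delete_while_open is hypothetical, and the
open() call merely stands in for a cached handle; the sketch assumes
CPython's default file-sharing behavior, under which removing a file
that is still open typically fails on Windows with PermissionError,
while POSIX systems allow the removal. A real cache could request a
different sharing mode, so treat this as an illustration only.

```python
import os
import tempfile


def delete_while_open():
    """Return True if the file could be removed while a handle was open."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    holder = open(path, "rb")  # stands in for a handle kept by a cache
    try:
        try:
            os.remove(path)
            return True
        except PermissionError:
            # Typical result on Windows: the open handle blocks deletion.
            return False
    finally:
        holder.close()
        if os.path.exists(path):
            os.remove(path)  # clean up once the handle is released
```

The function cleans up after itself either way, so it can be run
repeatedly to compare behavior across platforms.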

>>She thinks that the benefits would outweigh the tiny possibility of bad
>>cache data resulting from something like performing an "ls" on a file
>>and having, e.g., some other process sneak in, remove the file and
>>introduce a directory, but still having "ls" report file data.
>
>Isn't this already a problem on networked shares?  I.e.  doesn't
>Windows cache file info from network shares for a few seconds (maybe
>more if one has local-file caching turned on).

I don't know but, regardless, this would increase the possibility for
surprise to include local disks too.  I'm not convinced that this is a
good thing.  This would make the behavior that Gary R.  Van Sickle
recently reported as the result of using google search (I think it was
google search), where files were kept open even though it seems like
they should be closed, common with cygwin.

>However, you spend time writing how no one _ever_ investigates
>performance problems or suggests solutions.  That appears to be a
>cynical view.  Then, when offered a clear example to the contrary, you
>discard the effort as being "unoriginal" and already something that has
>been (and is being) considered independently of their suggestion.
>
>That \could\ be perceived, by some, as "mean-spirited" or "spiteful".
>I don't feel that this _encourages_ people to take the time to actually
>"figure out" problems nor "figure out" improvements.  If they don't
>know you, some people might take it personally.  :-) (Not that you
>would be expected to care, publicly :-) ).

You seem to be affronted by something that I said before you even
responded.  I did not respond to your email with a "you didn't even look
at the code" response.  I did not say "you are unoriginal".  I merely
represented our current thinking about the subject that you raised.

I happen to know that Corinna isn't around so I wanted to make sure that
she got the credit for having been thinking about this and even going so
far as to start coding something, I believe.  We have been talking about
caching for a long, long time.  I believe that there is even an "#if 0"
or two in the cygwin code still which contains my aborted attempt to
cache some path_conv lookups.

cgf

--
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
Problem reports:       http://cygwin.com/problems.html
Documentation:         http://cygwin.com/docs.html
FAQ:                   http://cygwin.com/faq/
