Mail Archives: cygwin/2005/06/05/23:47:05

Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm
List-Subscribe: <mailto:cygwin-subscribe AT cygwin DOT com>
List-Archive: <http://sourceware.org/ml/cygwin/>
List-Post: <mailto:cygwin AT cygwin DOT com>
List-Help: <mailto:cygwin-help AT cygwin DOT com>, <http://sourceware.org/ml/#faqs>
Sender: cygwin-owner AT cygwin DOT com
Mail-Followup-To: cygwin AT cygwin DOT com
Delivered-To: mailing list cygwin AT cygwin DOT com
Date: Sun, 5 Jun 2005 23:46:52 -0400
From: Christopher Faylor <cgf-no-personal-reply-please AT cygwin DOT com>
To: cygwin AT cygwin DOT com
Subject: Re: Performance problems
Message-ID: <20050606034652.GB9161@trixie.casa.cgf.cx>
Reply-To: cygwin AT cygwin DOT com
References: <4297A14B DOT 9070409 AT plausible DOT org> <20050528131501 DOT V53507 AT logout DOT sh DOT cvut DOT cz> <20050528160424 DOT GB12395 AT trixie DOT casa DOT cgf DOT cx> <429ED094 DOT 9080001 AT tlinx DOT org> <Pine DOT GSO DOT 4 DOT 61 DOT 0506021301320 DOT 10282 AT slinky DOT cs DOT nyu DOT edu> <20050602172226 DOT GC6597 AT trixie DOT casa DOT cgf DOT cx> <42A2246D DOT 3090000 AT tlinx DOT org> <20050605005508 DOT GA2706 AT trixie DOT casa DOT cgf DOT cx> <42A3BC5C DOT 1090605 AT tlinx DOT org>
Mime-Version: 1.0
In-Reply-To: <42A3BC5C.1090605@tlinx.org>
User-Agent: Mutt/1.5.8i

On Sun, Jun 05, 2005 at 08:00:44PM -0700, Linda W wrote:
>Christopher Faylor wrote:
>>On Sat, Jun 04, 2005 at 03:00:13PM -0700, Linda W wrote:
>>>You are technically accurate, but the cygwin layer is a
>>>POSIX-compliant OS emulation layer by some definition, no?
>>
>>Yes, but that has nothing to do with caching.  Cygwin is just a DLL.  It
>>can't monitor all file transactions in the whole system.
>
>True, but cygwin doesn't need to monitor the entire OS -- neither
>does Windows. Take a look at the open file descriptors held by
>the winlogon process sometime -- it holds open OS-specific
>directories and files.

I am talking about cache coherency.  The OS can maintain it because it
knows what files are being updated.  Cygwin can't.  If cygwin opens a
file and another unrelated process modifies it, cygwin's cached
information would be wrong.  This is a simple statement of fact.
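
As an illustration of the coherency problem described above, here is a minimal Python sketch. It is not Cygwin code: `stat_cache`, `cached_size` and `demo` are hypothetical names, and the "unrelated process" is simulated by a second write in the same script that bypasses the cache.

```python
import os
import tempfile

# Hypothetical per-process cache: path -> cached file size.
stat_cache = {}

def cached_size(path):
    """Return the file size, asking the OS only on the first lookup."""
    if path not in stat_cache:
        stat_cache[path] = os.stat(path).st_size
    return stat_cache[path]

def demo():
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "w") as f:
            f.write("abc")
        first = cached_size(path)        # fetched from the OS, then cached
        # Stand-in for an unrelated process that bypasses the cache.
        with open(path, "a") as f:
            f.write("defgh")
        stale = cached_size(path)        # served from the cache, never updated
        actual = os.stat(path).st_size   # what the OS itself reports
        return first, stale, actual
    finally:
        stat_cache.pop(path, None)
        os.remove(path)
```

Calling `demo()` returns the size cached on first lookup, the size the cache reports after the outside write, and the size the OS reports. The last two differ because nothing told the cache about the second write.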

>Cygwin would only need to "cache" items (in the sense I would
>anticipate) while the DLL is loaded and only those file items
>that are being used by the current program.  For example a simple
>find command on /tmp "find /tmp" produces 17 lines:
>/tmp
>/tmp/d.txt
>/tmp/run-crons.ZE1996
>/tmp/run-crons.ZE1996/run-crons.1924
>/tmp/run-crons.ZE1996/run-crons.daily.1924
>/tmp/588-reg.reg
>/tmp/1892-reg.reg
>/tmp/VolumeC.txt
>/tmp/xyz.txt
>/tmp/wd.txt
>/tmp/d1.txt
>/tmp/xyz.txt.orig
>/tmp/AUTORUN.INF
>/tmp/WD_Data.ICO
>/tmp/WD_Install.exe
>/tmp/img1
>/tmp/1
>============
>In all there were 311 file operations to list these 17 files.
>They break down as follows:
>1-27 - finding program by bash
>28-48 - loading libraries
>49-75 - processing C:\, C:\home and C:\home\username
>76-243 - working on tmp
>244-311 - accessing home directory; search for psapi.dll & close of /tmp
>
>The ones working on tmp were broken down as follows:
>
>The first 27 were processing by bash to find "find.exe". Ignore.
>Commands 28-48 were loading cygwin libraries by the find
>command; ignore that.
>Commands 49-75 involved file ops (Open, Query Info, Directory) on the
>paths C:\, C:\home\ and C:\home\user.  Calls 76-243 seem to be working
>on /tmp.  The tmp calls (executing between time index 51.995 -
>51.005, <1 clock tick) show the following breakdown:
>
>     1 C:\home\law, QUERY INFORMATION
>     1 C:\tmp\d.txt, READ
>     2 C:\home\law, CLOSE
>     2 C:\home\law, OPEN
>     2 C:\tmp\d.txt, CLOSE
>     2 C:\tmp\d.txt, OPEN
>     3 C:\tmp\d.txt, QUERY INFORMATION
>     5 C:\tmp\run-crons.ZE1996\, CLOSE
>     5 C:\tmp\run-crons.ZE1996\, OPEN
>     6 C:\tmp\run-crons.ZE1996, QUERY INFORMATION
>     7 C:\, CLOSE
>     7 C:\, DIRECTORY
>     7 C:\, OPEN
>     8 C:\tmp\run-crons.ZE1996, CLOSE
>     8 C:\tmp\run-crons.ZE1996, OPEN
>    10 C:\tmp, QUERY INFORMATION
>    12 C:\tmp\, CLOSE
>    12 C:\tmp\, OPEN
>    13 C:\tmp, CLOSE
>    13 C:\tmp, OPEN
>    15 C:\tmp\run-crons.ZE1996\, DIRECTORY
>    28 C:\tmp\, DIRECTORY
>
>So if I was wanting to cache -- say limit caching to ~.1-1 seconds,

(And as soon as you start timing out your cache, you either have a
separate thread running which manages this (which implies careful
attention to locking issues and context switching) or you schedule a
timer signal (which has similar problems).)

>it would appear, on the surface, to possibly reduce the 169 calls to
>maybe 22?

You really can't predict without looking at the code.  There is no way
of knowing what the above information represents as far as what cygwin
and find are doing.
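
For readers who want to see what the time-limited cache proposed in the quoted text might look like, here is a rough Python sketch. It is not taken from Cygwin: `TimedStatCache`, `ttl`, `clock` and `os_calls` are hypothetical names. Entries are expired lazily at lookup time, which is one way to sidestep the thread or timer signal mentioned above. It does nothing for coherency within the time window, and it says nothing about how many of the calls in the trace above would really be saved.

```python
import os
import time

class TimedStatCache:
    """Hypothetical stat cache whose entries expire after `ttl` seconds."""

    def __init__(self, ttl, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self.entries = {}    # path -> (expiry time, os.stat_result)
        self.os_calls = 0    # lookups that actually reached the OS

    def stat(self, path):
        now = self.clock()
        hit = self.entries.get(path)
        if hit is not None and now < hit[0]:
            return hit[1]    # entry still fresh: no OS call
        # Missing or expired entry: ask the OS and restart the timeout.
        self.os_calls += 1
        result = os.stat(path)
        self.entries[path] = (now + self.ttl, result)
        return result
```

With a fake clock passed as `clock`, repeated `stat(".")` calls inside the timeout leave `os_calls` at 1, and the first call after the timeout raises it to 2.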

>>You can't do that without taking the fact that the handle is open into
>>account when cygwin itself removes a file, opens a file, renames a file.
>> 
>You can't? It would seem the cygwin library itself could maintain
>its own list of open descriptors and close them when needed.  Doesn't
>cygwin use a shared-memory region for interprocess communication? 
>Couldn't this same region be used for the File-handle/info cache so
>multiple cygwin processes would behave with each other?

I was filling in the details here just to show that the solution of
keeping files open has consequences.  Keeping the file open increases
the complexity of every function which manipulates a file rather than
the one or two functions which might be interested in the cached status
information.
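
To make the complexity point concrete, here is a small Python sketch under stated assumptions. `HandleCache` and its methods are hypothetical and do not reflect how Cygwin is implemented. Once handles are kept open, operations such as remove and rename can no longer call the OS directly; each needs a wrapper that first drops any cached handle.

```python
import os

class HandleCache:
    """Hypothetical cache that keeps files open between lookups."""

    def __init__(self):
        self.handles = {}    # path -> open file object

    def open_cached(self, path):
        if path not in self.handles:
            self.handles[path] = open(path, "rb")
        return self.handles[path]

    def _drop(self, path):
        f = self.handles.pop(path, None)
        if f is not None:
            f.close()

    # Every path-manipulating operation now has to know about the cache.
    def remove(self, path):
        self._drop(path)
        os.remove(path)

    def rename(self, src, dst):
        self._drop(src)
        self._drop(dst)
        os.replace(src, dst)
```

The sketch covers a single process only. Sharing such a table between processes, as suggested in the quoted text, would add locking on top of this.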

>>And it could be pretty surprising to find that when process a does an
>>opendir/readdir, process b is now unable to delete a file.
>> 
>I'm not 100% certain, but I believe having a file (or dir) open
>for read doesn't mean someone can't change the contents.  They
>just can't delete the dir or file that is still opened for reading.

Which is why I said "unable to delete a file".

>This is already a problem even w/o caching.  Cygwin can't delete
>various directories because they are kept open by the login shell.

Being unable to consistently delete a file because something has it open
is explainable.  What isn't explainable is "Why does my configure script
work sometimes but not others?"  When you talk about keeping caching
information around, you stand the chance of something like this not
working:

  find . -name foo | xargs rm

because find may still have foo open when rm tries to remove it.

That may not be a huge deal for a FS/OS which honors DELETE_ON_CLOSE and
will be able to delete "foo" regardless, but this introduces another
potential place for code complexity.
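
The effect of an open handle on deletion can be probed with a short Python sketch. The function name `try_remove_while_open` is hypothetical, and the reader handle only stands in for find holding foo open. The outcome is platform dependent: on POSIX systems the removal normally succeeds, while under native Windows Python the open handle typically makes `os.remove` fail. What happens under Cygwin depends on its own emulation and is not asserted here.

```python
import os
import tempfile

def try_remove_while_open():
    """Return True if a file could be removed while a read handle was open."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    reader = open(path, "rb")    # plays the part of find holding foo open
    try:
        os.remove(path)          # plays the part of rm
        removed = True
    except OSError:              # typically PermissionError on Windows
        removed = False
    finally:
        reader.close()
    if not removed:
        os.remove(path)          # clean up once the handle is closed
    return removed
```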

cgf

--
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
Problem reports:       http://cygwin.com/problems.html
Documentation:         http://cygwin.com/docs.html
FAQ:                   http://cygwin.com/faq/
