Mail Archives: djgpp/1999/10/06/16:36:07

Date: Wed, 6 Oct 1999 16:15:49 +0200 (IST)
From: Eli Zaretskii <eliz AT is DOT elta DOT co DOT il>
X-Sender: eliz AT is
To: Charles Sandmann <sandmann AT clio DOT rice DOT edu>
cc: djgpp AT delorie DOT com
Subject: Re: Reading directories, readdir/stat too slow
In-Reply-To: <9910051657.AA15777@clio.rice.edu>
Message-ID: <Pine.SUN.3.91.991006161523.13916L-100000@is>
MIME-Version: 1.0
Reply-To: djgpp AT delorie DOT com
X-Mailing-List: djgpp AT delorie DOT com
X-Unsubscribes-To: listserv AT delorie DOT com

On Tue, 5 Oct 1999, Charles Sandmann wrote:

> > Won't this break if the file changes on disk between the call to
> > readdir and the call to stat?  
> 
> I used a single file element cache, and invalidated it on any other
> dpmi_int call.

If you invalidate the cache on any __dpmi_int, I'm afraid you will not
gain much.  After all, I'd expect that a program that walks a
directory actually *does* something with the files it finds, and most
file-oriented operations call DOS through __dpmi_int.  Heck, stat
itself calls __dpmi_int quite a few times before it gets to calling
findfirst (which is where the cache gets used).
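
To make the trade-off concrete, here is a minimal sketch of a
single-entry cache of the kind discussed above.  It is not the code
from either of our libraries; the names cache_store,
cache_lookup_size, and cache_invalidate are hypothetical.  The idea
is that readdir would call cache_store with what findnext returned,
stat would call cache_lookup_size first, and every other DOS call
would go through cache_invalidate.

```c
#include <string.h>

/* One cached directory entry (hypothetical layout). */
struct cached_entry {
    char name[260];   /* file name as returned by the directory scan */
    long size;        /* file size in bytes */
    int  valid;       /* nonzero while the entry may be trusted */
};

static struct cached_entry cache;

/* Remember the entry the directory scan just produced. */
void cache_store(const char *name, long size)
{
    if (strlen(name) >= sizeof cache.name) {
        cache.valid = 0;          /* too long to cache, so drop it */
        return;
    }
    strcpy(cache.name, name);
    cache.size = size;
    cache.valid = 1;
}

/* Called on any other DOS call: the disk may have changed. */
void cache_invalidate(void)
{
    cache.valid = 0;
}

/* Return the cached size, or -1 on a miss (caller must ask DOS). */
long cache_lookup_size(const char *name)
{
    if (!cache.valid || strcmp(cache.name, name) != 0)
        return -1L;
    return cache.size;
}
```

With this shape, a hit is only possible when nothing calls
cache_invalidate between cache_store and cache_lookup_size, which is
exactly the window that shrinks when every __dpmi_int invalidates.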

> I wasn't trying to protect against all the possible cases.  But I did
> need a 50X speedup.  It worked on all code except unix ports that assumed
> a unix file system (inodes, etc).

Right.  So these solutions are probably good for highly specialized
versions of stat that are closely coupled with the specific needs of a
single application.  They cannot be implemented in libc.a.

> A lot of the stuff which causes readdir/stat to be slow isn't needed 
> for the typical application

I think this depends on the definition of ``the typical
application''.  My idea seems to be different from yours in this
case ;-).  Perhaps a DOS-based program which doesn't expect much from
the filesystem information indeed doesn't need 90% of what stat
generates, but programs of Unix origin do need most of it.

I will try to add more bits to _djstat_flags, so that, for example,
calls to mktime could be replaced with a simple computation that
assumes GMT.  Lamentably, most of the time in stat is spent calling
findfirst (once you turn off the other heavy code with _djstat_flags),
and that cannot be worked around in a general-purpose version.
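
As an illustration of the "simple computation that assumes GMT", here
is a rough sketch, not the actual libc code.  It decodes the packed
DOS date and time words (year since 1980 in bits 15-9, month in bits
8-5, day in bits 4-0; hours in bits 15-11, minutes in bits 10-5,
seconds/2 in bits 4-0) and counts seconds since 1970 directly, without
calling mktime.  The function names are hypothetical.

```c
#include <assert.h>
#include <time.h>

/* Days from 1970-01-01 to the given civil date (Gregorian calendar).
   Only valid for y >= 1 here; DOS stamps start at 1980. */
static long days_from_civil(int y, int m, int d)
{
    int era, yoe, mp, doy, doe;

    if (m <= 2)
        y -= 1;                        /* treat Jan/Feb as end of previous year */
    era = y / 400;                     /* 400-year cycle number */
    yoe = y - era * 400;               /* year within the cycle, 0..399 */
    mp  = (m > 2) ? m - 3 : m + 9;     /* month index with March = 0 */
    doy = (153 * mp + 2) / 5 + d - 1;  /* day within the March-based year */
    doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;
    return (long)era * 146097L + doe - 719468L;
}

/* Convert packed DOS date/time words to time_t, assuming the stamp is GMT. */
time_t dos_stamp_to_time_gmt(unsigned dos_date, unsigned dos_time)
{
    int year  = 1980 + (int)((dos_date >> 9) & 0x7f);
    int month = (int)((dos_date >> 5) & 0x0f);
    int day   = (int)(dos_date & 0x1f);
    int hour  = (int)((dos_time >> 11) & 0x1f);
    int min   = (int)((dos_time >> 5) & 0x3f);
    int sec   = (int)(dos_time & 0x1f) * 2;   /* stored in 2-second units */

    assert(month >= 1 && month <= 12 && day >= 1);
    return (time_t)days_from_civil(year, month, day) * 86400
           + hour * 3600L + min * 60L + sec;
}
```

Because no timezone is consulted, the result differs from what mktime
would return by the local offset from GMT; that is the price of
skipping the call.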

> - if you look at the minimal calls to do
> readdir/stat for portable structure entries (no inodes, etc) you get
> this from the single findfirst/findnext call - and thus almost identical
> speed.

One problem is that findfirst is about 10 times more expensive than
findnext.  Since stat cannot call the latter, you lose by a factor of
10 before you even begin optimizing.

Let's face it: the Unix filesystem has one system call to read a
directory, and another to get the file info, whereas DOS/Windows
implement both in a single call.  These two models are just too
incompatible to allow portable programs to run at native speed.
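
For reference, the two-call Unix model looks roughly like the sketch
below: one call to fetch each name, then a separate stat per name.
The helper name count_stat_calls is hypothetical, and the fixed path
buffer size is an assumption made for brevity.  On DOS, each of those
stat calls ends up as its own findfirst, even though the preceding
readdir already had the same information in hand.

```c
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Walk one directory the portable way.  Returns the number of
   successful stat calls, or -1 if the directory cannot be opened. */
long count_stat_calls(const char *dir)
{
    DIR *d = opendir(dir);
    struct dirent *de;
    long count = 0;

    if (d == NULL)
        return -1;

    while ((de = readdir(d)) != NULL) {      /* call 1: get the next name */
        char path[4096];
        struct stat st;
        int n;

        if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0)
            continue;

        n = snprintf(path, sizeof path, "%s/%s", dir, de->d_name);
        if (n < 0 || (size_t)n >= sizeof path)
            continue;                        /* name too long, skip it */

        if (stat(path, &st) == 0)            /* call 2: get the file info */
            count++;
    }
    closedir(d);
    return count;
}
```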
