From: skb AT xmission DOT removethis DOT com (Scott Brown)
Newsgroups: comp.os.msdos.djgpp
Subject: Re: Reading directories, readdir/stat too slow
Date: Thu, 30 Sep 1999 22:34:47 GMT
Organization: (none)
Lines: 56
Message-ID: <37f3c4e1.1015552570@news.xmission.com>
References: <37f307e1 DOT 967161774 AT news DOT xmission DOT com>
NNTP-Posting-Host: slc1140.modem.xmission.com
X-Trace: news.xmission.com 938730851 1266 166.70.8.124 (30 Sep 1999 22:34:11 GMT)
X-Complaints-To: abuse AT xmission DOT com
NNTP-Posting-Date: 30 Sep 1999 22:34:11 GMT
X-Newsreader: Forte Free Agent 1.11/32.235
To: djgpp AT delorie DOT com
DJ-Gateway: from newsgroup comp.os.msdos.djgpp
Reply-To: djgpp AT delorie DOT com

On Thu, 30 Sep 1999 16:17:21 +0200, Eli Zaretskii wrote:

>
>On Thu, 30 Sep 1999, Scott Brown wrote:
>
>> Unfortunately, I find that opendir/readdir, when combined with a stat
>> call (to get file mode/size/etc) is a *lot* slower than findfirst,
>> taking on the order of 70-100 times longer to perform the same work.
>> When running against tens of thousands of files in hundreds of
>> directories, it is a significant problem.
>
>stat is expensive (it isn't easy to get all that info on DOS; you won't
>believe how closely DOS guards some of its dirty secrets ;-). But 100
>times slower seems to be too much; I suspect your system is not set up in
>an optimal way. See section 3.9 of the FAQ; in particular, make sure you
>have a disk cache installed.

Well, I'm running under Windows 95, which has a built-in disk cache,
and my hardware is plenty powerful for this kind of work (K6-200
w/64Mb). Statistics from sysmon are limited, but it shows that my
disk cache hasn't dropped below 2.5Mb in the last little while.

>A 10-fold slow-down when using stat is
>something I would expect, but not 100-fold.

It seemed pretty ridiculous to me as well.
I put together a simple timed test and ran it against a set of about
30,000 files in 350 directories; I ran each test twice and took the
second result, to give the OS a fair chance to load the cache. The
findfirst test finished in 1.98 seconds, while the readdir/stat test
finished in a whopping 133.96 seconds. I believe my test code is
fair, but second opinions are welcome; grab a copy here:

ftp://ftp.xmission.com/pub/users/s/skb/pub/dirtest.c

>Read the docs ;-). That always helps...
>
>No, seriously: the documentation of _djstat_flags in libc.info describes
>several flags that can be set to disable computing some expensive
>members of struct stat for which you don't have any use. For the
>fastest operation, you should disable all features but those which your
>application needs. Doing so is known to speed up stat tremendously.

I only need the mode, and the size and timestamp for files. After I
R'd TFM, I tried setting *all* of the _STAT... bits, but the results
were disappointing; certainly not what I'd term a "tremendous"
improvement. Instead of taking 133.96 seconds to finish, the test
took only 125.44 seconds.

My test program is fairly representative of the kind of directory
traversal code I use in my applications. Could there be something
else that I am overlooking?
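
For anyone who would rather not fetch dirtest.c, here is a minimal
sketch of the kind of readdir/stat loop described above. It is not
the actual test program: walk_tree and timed_walk are hypothetical
names made up for illustration, the 1024-byte path buffer is an
arbitrary assumption, and the _djstat_flags assignment assumes the
flag names listed in libc.info. That assignment is guarded so the
same code also builds outside DJGPP.

```c
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <time.h>

/* Hypothetical helper (not from dirtest.c): walk `path` recursively with
   opendir/readdir/stat, counting regular files and summing their sizes.
   Returns 0 on success, -1 if `path` itself cannot be opened. */
static int walk_tree(const char *path, unsigned long *files,
                     unsigned long *bytes)
{
    DIR *dir = opendir(path);
    struct dirent *ent;
    struct stat st;
    char full[1024];   /* assumed to be large enough for the test tree */

    if (dir == NULL)
        return -1;

    while ((ent = readdir(dir)) != NULL) {
        if (strcmp(ent->d_name, ".") == 0 || strcmp(ent->d_name, "..") == 0)
            continue;
        /* Skip entries whose full name would not fit in the buffer. */
        if (strlen(path) + strlen(ent->d_name) + 2 > sizeof full)
            continue;
        sprintf(full, "%s/%s", path, ent->d_name);

        if (stat(full, &st) != 0)
            continue;                       /* unreadable entry: ignore */
        if (S_ISDIR(st.st_mode)) {
            walk_tree(full, files, bytes);  /* unreadable subdirs are skipped */
        } else if (S_ISREG(st.st_mode)) {
            ++*files;
            *bytes += (unsigned long)st.st_size;
        }
    }
    closedir(dir);
    return 0;
}

/* Hypothetical timing wrapper. Returns elapsed clock() time in seconds,
   or -1.0 if the top-level directory cannot be opened. Note that on
   POSIX systems clock() reports CPU time rather than wall-clock time. */
static double timed_walk(const char *path, unsigned long *files,
                         unsigned long *bytes)
{
    clock_t start, end;

    assert(path != NULL && files != NULL && bytes != NULL);

#ifdef __DJGPP__
    /* Setting a bit asks stat to skip that feature (see _djstat_flags in
       libc.info). Only mode, size and timestamp are needed here. */
    _djstat_flags = _STAT_INODE | _STAT_EXEC_EXT | _STAT_EXEC_MAGIC
                  | _STAT_DIRSIZE | _STAT_ROOT_TIME | _STAT_WRITEBIT;
#endif

    *files = 0;
    *bytes = 0;
    start = clock();
    if (walk_tree(path, files, bytes) != 0)
        return -1.0;
    end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Calling timed_walk twice on the same tree and keeping the second
figure matches the "let the cache warm up first" procedure used for
the numbers above. A findfirst/findnext version of walk_tree would be
needed for the other half of the comparison; it is left out here.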