Mail Archives: cygwin/2018/09/30/15:50:52

Subject: Re: Filesystem enumeration performance improvement
To: cygwin AT cygwin DOT com, marco DOT mason AT gmail DOT com
References: <CANNqMjAGEm64Z4ULhbe4KtcmT1Y7njOYoJCG6V_KbrFuj1dj=A AT mail DOT gmail DOT com>
From: =?UTF-8?Q?J=c3=bcrgen_Wagner?= <juergen AT wagner DOT is>
Message-ID: <c02d78bc-3b97-35f3-e2b6-5811d38b188c@wagner.is>
Date: Sun, 30 Sep 2018 21:50:36 +0200

Hi Marco,

since your program calls the Windows API directly and bypasses the Cygwin
APIs, changes to how stat()/readdir() or related functions work in Cygwin
are not a plausible explanation for the speed-up. I also doubt that
printf() could have improved enough to account for such a dramatic
difference.

In my experience, such effects usually have one of two reasons:

- There is some caching involved, either in Windows or at the disk
level. Run the benchmarks with empty caches or with caching disabled.

- Your virus scanner has improved, so that querying file status no
longer triggers excessive scans. This is a bit harder to verify or
test.
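
One rough way to check the caching explanation is to time the same
enumeration several times in a row and compare the first pass with the
later ones. The following is only a sketch in Python, not the enumerator
discussed in this thread: the function names count_entries and
time_passes are made up for illustration, and it uses os.scandir rather
than FindFirstFile/FindNextFile. It does not empty any cache itself, so a
truly cold first pass still needs a fresh boot or a freshly mounted
volume.

```python
import os
import time

def count_entries(root):
    """Walk the tree under root with os.scandir and return the entry count."""
    total = 0
    stack = [root]
    while stack:
        path = stack.pop()
        try:
            with os.scandir(path) as it:
                for entry in it:
                    total += 1
                    # Descend into real directories only, not symlinked ones.
                    if entry.is_dir(follow_symlinks=False):
                        stack.append(entry.path)
        except OSError:
            # Unreadable directories are skipped, not treated as fatal.
            continue
    return total

def time_passes(root, passes=3):
    """Return a list of (seconds, entry_count) tuples, one per pass."""
    results = []
    for _ in range(passes):
        start = time.perf_counter()
        n = count_entries(root)
        results.append((time.perf_counter() - start, n))
    return results

if __name__ == "__main__":
    for i, (secs, n) in enumerate(time_passes("."), start=1):
        print(f"pass {i}: {n} entries in {secs:.3f} s")
```

If the first pass is much slower than the following ones, caching is a
likely contributor; if all passes take about the same time, look
elsewhere (for example at the virus scanner).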

Did you compare your program's performance with that of Cygwin's "find"?
Did that also show such a dramatic increase in throughput?
There is a free and quite fast disk space analyzer called RidNacs
(ScanDir backwards). If the magic you observe is an optimized way of
caching, this program should also be affected.
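
For the comparison with "find", a small timing wrapper is enough. This is
a sketch under assumptions: time_command is a made-up helper name, and it
assumes that the "find" on your PATH is Cygwin's (in a plain Windows
shell, find.exe is a different tool). The same helper can be pointed at
your own enumerator by passing its command line instead.

```python
import subprocess
import sys
import time

def time_command(argv):
    """Run argv, capture its stdout, and return (seconds, output_line_count)."""
    start = time.perf_counter()
    proc = subprocess.run(argv, stdout=subprocess.PIPE,
                          stderr=subprocess.DEVNULL)
    elapsed = time.perf_counter() - start
    # Count newline bytes so the result does not depend on the encoding.
    return elapsed, proc.stdout.count(b"\n")

if __name__ == "__main__":
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    secs, lines = time_command(["find", root])
    print(f"find: {lines} lines in {secs:.3f} s")
```

Running it once under each Cygwin version, on the same tree, should show
whether "find" got a comparable speed-up.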



Cheers,
--J.


On 30.09.2018 20:41, Marco Mason wrote:
> I recently upgraded from cygwin v2.10 to v2.11.1 and noticed that one of my
> programs got a tremendous speed boost.  It's a custom filesystem
> enumeration program whose output I feed to frcode to update the
> /var/locatedb database.  It used to take quite a bit of time (15-20
> minutes?), and now runs in about a minute.  Since the program seems to work
> well, just many times faster, I'm rather happy with the changes.
>
> The reason I'm writing is that I don't see *why* I should have any timing
> changes at all!  The reason I have my own file enumerator for locatedb is
> that the original went through the POSIX layer and was pretty slow,
> especially for remote-mounts.  As I only needed enough for locate, I wrote
> my own enumerator against the Windows API for speed.  Since my loop is
> essentially just using FindFirstFile/FindNextFile and printf(), I don't
> know why file gathering would be any faster.
>
> So either printf() has gotten remarkably faster, or there are some
> interactions between Cygwin and Windows in the file enumeration area that
> are surprising me.  Can someone please clue me in to what might be causing
> the speed increases?
>
> Looking at the git log and mailing list history, my best guess would be
> that it's related to the email threads "Why does readdir() open files ?"
> (Ben Rubson 2018-03-28) and "Why does (stat() ?) open files ?" (Ben Rubson
> 2018-04-09).  However, I can't seem to pin down which git commits are
> relevant to those threads.  If anyone can provide a little insight, I'd
> really appreciate it.
>
> --marco
>
> --
> Problem reports:       http://cygwin.com/problems.html
> FAQ:                   http://cygwin.com/faq/
> Documentation:         http://cygwin.com/docs.html
> Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
>
>





  Copyright © 2019   by DJ Delorie     Updated Jul 2019