Subject: RE: Re: find . -regex '.*js' -type f -exec md5sum '{}' \; really slow!
Date: Mon, 24 Nov 2008 17:53:38 -0500
Message-ID: <31DDB7BE4BF41D4888D41709C476B6570C2AAE56@NIHCESMLBX5.nih.gov>
In-Reply-To: <ggep9e$f5l$1@ger.gmane.org>
References:  <1227540449.7201.45.camel@LxPC35> 	 <F0D7281DAB048B438E8F5EC4ECEFBDDC0337DE68@esmail.elsag.de> 	 <1227542582.7201.51.camel@LxPC35> <1227542941.7201.55.camel@LxPC35> <ggep9e$f5l$1@ger.gmane.org>
From: "Buchbinder, Barry (NIH/NIAID) [E]" <BBuchbinder@niaid.nih.gov>
To: <cygwin@cygwin.com>, "Matthew Woehlke" <mw_triad@users.sourceforge.net>

The following may technically be off-topic.  If so, I apologize.

Matthew Woehlke wrote on Monday, November 24, 2008 12:46 PM:
> Bartolomeo Nicolotti wrote:
>> but the command
>> 
>> find . -type f | xargs md5sum
>> 
>> has problems with blanks in the name of the files:
>> [snip examples]
> 
> find . -type f -print0 | xargs -0 md5sum

I've found that find under Cygwin is significantly slower than the native Windows tools, especially on a networked drive.  (Each of the following was run several times beforehand so that any cached file system data was already filled.)

    local hard disk (C:):

	> time "$(cygpath -u "${COMSPEC}")" /c dir /s /b /a:-d | wc
	  16085   16308  690388

	real    0m0.343s
	user    0m0.122s
	sys     0m0.170s

    networked drive:

	> time "$(cygpath -u "${COMSPEC}")" /c dir /s /b /a:-d | wc
	   1183    3093   66761

	real    0m3.078s
	user    0m0.075s
	sys     0m0.108s

	> time find . -type f | wc
	   1183    3093   53748

	real    1m0.813s
	user    0m0.216s
	sys     0m8.046s

Therefore, if there are no symbolic links or other "oddities", and it doesn't offend your sensibilities, you might consider using something like the following.  (I'm not sure how symbolic links work with find . -type f, so they might not be a problem.)

"$(cygpath -u "${COMSPEC}")" /c dir /s /b /a:-d | \
	tr -s '\r\n' '\n' | \
	cygpath -u -f - | \
	tr '\n' '\0' | \
	xargs -r0 md5sum
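
The last two stages are what deal with the blanks in file names.  A minimal sketch of why, assuming GNU xargs and md5sum; the temporary directory, the two file names and the list_files function are made up for illustration, with list_files standing in for the dir | cygpath stages so that it can be tried without cmd.exe:

```shell
#!/bin/sh
# Hypothetical demo: two files, one with a blank in its name.
demo_dir=$(mktemp -d)
printf 'one\n' > "$demo_dir/plain.js"
printf 'two\n' > "$demo_dir/with blank.js"

# Stand-in for the "dir /s /b | cygpath" stages: one path per line.
list_files() {
    printf '%s\n' "$demo_dir/plain.js" "$demo_dir/with blank.js"
}

# Newline-separated input: xargs splits on blanks as well, so md5sum is
# handed the two halves of the second name and fails on both of them.
naive_count=$( { list_files | xargs md5sum 2>/dev/null || true; } | wc -l | tr -d ' ')

# NUL-separated input: each path reaches md5sum as a single argument.
safe_count=$(list_files | tr '\n' '\0' | xargs -r0 md5sum | wc -l | tr -d ' ')

echo "plain xargs hashed $naive_count file(s); tr + xargs -r0 hashed $safe_count file(s)"

rm -rf "$demo_dir"
```

This still breaks on a file name that itself contains a newline, which find -print0 would handle.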



