Mail Archives: cygwin/2008/11/24/17:54:40

Subject: RE: Re: find . -regex '.*js' -type f -exec md5sum '{}' \\; really slow!
Date: Mon, 24 Nov 2008 17:53:38 -0500
Message-ID: <31DDB7BE4BF41D4888D41709C476B6570C2AAE56@NIHCESMLBX5.nih.gov>
In-Reply-To: <ggep9e$f5l$1@ger.gmane.org>
References: <1227540449 DOT 7201 DOT 45 DOT camel AT LxPC35> <F0D7281DAB048B438E8F5EC4ECEFBDDC0337DE68 AT esmail DOT elsag DOT de> <1227542582 DOT 7201 DOT 51 DOT camel AT LxPC35> <1227542941 DOT 7201 DOT 55 DOT camel AT LxPC35> <ggep9e$f5l$1 AT ger DOT gmane DOT org>
From: "Buchbinder, Barry (NIH/NIAID) [E]" <BBuchbinder AT niaid DOT nih DOT gov>
To: <cygwin AT cygwin DOT com>, "Matthew Woehlke" <mw_triad AT users DOT sourceforge DOT net>

The following may technically be off-topic.  If so, I apologize.

Matthew Woehlke wrote on Monday, November 24, 2008 12:46 PM:
> Bartolomeo Nicolotti wrote:
>> but the command
>> 
>> find . -type f | xargs md5sum
>> 
>> has problems with blanks in the name of the files:
>> [snip examples]
> 
> find . -type f -print0 | xargs -0 md5sum
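The difference between the two quoted commands can be reproduced with a small sketch. The temporary directory and the file name `my file.js` are made up for illustration; nothing here comes from the original poster's tree.

```shell
# Hypothetical demo directory; the file name is invented for illustration.
demo=$(mktemp -d)
printf 'hello\n' > "$demo/my file.js"

# Whitespace-delimited: xargs splits "./my file.js" into "./my" and "file.js",
# so md5sum is asked for two files that do not exist (errors are captured).
broken=$(cd "$demo" && find . -type f | xargs md5sum 2>&1) || true

# NUL-delimited: the whole name reaches md5sum as a single argument.
ok=$(cd "$demo" && find . -type f -print0 | xargs -0 md5sum)

printf '%s\n' "$ok"
rm -rf "$demo"
```

Only the NUL-delimited form should print a checksum line ending in `./my file.js`.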

I've found that Cygwin's find is significantly slower than native Windows tools.  (Each command below was run several times first, so that file system data was already cached.)

    local hard disk (C:):

	> time "$(cygpath -u "${COMSPEC}")" /c dir /s /b /a:-d | wc
	  16085   16308  690388

	real    0m0.343s
	user    0m0.122s
	sys     0m0.170s

    networked drive:

	> time "$(cygpath -u "${COMSPEC}")" /c dir /s /b /a:-d | wc
	   1183    3093   66761

	real    0m3.078s
	user    0m0.075s
	sys     0m0.108s

	> time find . -type f | wc
	   1183    3093   53748

	real    1m0.813s
	user    0m0.216s
	sys     0m8.046s

Therefore, if there are no symbolic links* and it doesn't offend your sensibilities, you might consider using something like the following.  (* Or other "oddities".  I'm not sure how find . -type f treats symbolic links, so this might not be a problem.)

"$(cygpath -u "${COMSPEC}")" /c dir /s /b /a:-d | \
	tr -s '\r\n' '\n' | \
	cygpath -u -f - | \
	tr '\n' '\0' | \
	xargs -r0 md5sum
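
The line-ending and delimiter handling in the pipeline above can be tried without Cygwin. This is a minimal sketch under assumptions: `listing` is a hypothetical stand-in for the `dir /s /b /a:-d` output (one path per line, CR LF terminated), the file names are invented, and the `cygpath -u -f -` step is left out because it only exists on Cygwin. It also assumes GNU tr and xargs.

```shell
# Hypothetical stand-in for `cmd /c dir /s /b /a:-d`: one path per line,
# each terminated by CR LF. File names are invented for illustration.
demo2=$(mktemp -d)
printf 'a\n' > "$demo2/one file.txt"
printf 'b\n' > "$demo2/two.txt"
listing() { printf '%s\r\n' "$demo2/one file.txt" "$demo2/two.txt"; }

# tr -s maps CR and LF to LF and squeezes each CR LF pair into a single LF.
# The second tr turns every LF into NUL so that xargs -0 receives whole
# paths, blanks included. On Cygwin, cygpath -u -f - would sit between them.
sums=$(listing | tr -s '\r\n' '\n' | tr '\n' '\0' | xargs -r0 md5sum)

printf '%s\n' "$sums"
rm -rf "$demo2"
```

The `-r` flag keeps xargs from running md5sum at all when the listing is empty, in which case md5sum would otherwise wait on standard input.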
