Mail Archives: djgpp-workers/2000/03/07/04:38:30

Date: Tue, 7 Mar 2000 10:49:51 +0200 (IST)
From: Eli Zaretskii <eliz AT is DOT elta DOT co DOT il>
X-Sender: eliz AT is
To: Alain Magloire <alain AT qnx DOT com>
cc: Nate Eldredge <neldredge AT hmc DOT edu>, djgpp-workers AT delorie DOT com
Subject: Re: DJGPP innovations ?????
In-Reply-To: <200003070250.VAA31684@qnx.com>
Message-ID: <Pine.SUN.3.91.1000307104850.21917A-100000@is>
MIME-Version: 1.0
Reply-To: djgpp-workers AT delorie DOT com
Errors-To: dj-admin AT delorie DOT com
X-Mailing-List: djgpp-workers AT delorie DOT com
X-Unsubscribes-To: listserv AT delorie DOT com

On Mon, 6 Mar 2000, Alain Magloire wrote:

> > Running a program in background is easy, but finding files that are
> > identical (efficiently) is not.
> 
> find $1 -xdev -type f -printf '%p %s\n' | \
>  sort -nk1 | tee candidates | \
>  uniq -f1 >uniquefiles && \
>  comm -3 candidates uniquefiles >redundant && \
>  join -1 2 -2 2 -o 2.1 1.1 redundant uniquefiles | xargs -n2 ln -f  

I'm probably missing something: the above doesn't seem to compare the
files' contents, only their names and sizes, right?  If so, this is
not what I think was the intent: identical names and sizes do not
mean the files' contents are identical.  You need `cmp' somewhere in
that pipe.

When I said ``efficiently'', I meant comparing file contents in a way
that avoids the quadratic behavior of checking every file against
every other one.
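
A minimal sketch of one way to avoid the quadratic comparison (not
something proposed in this thread): bucket the files by size, checksum
only those that share a size with at least one other file, and report
the files whose checksums match.  The function name find_dups is
hypothetical.  The sketch assumes GNU find, xargs, md5sum and uniq, and
paths that contain no tabs, newlines or backslashes.  Matching MD5 sums
are strong evidence but not proof of identical contents, so running
`cmp' on each reported pair before any `ln -f' would still be prudent.

```shell
#!/bin/sh
# find_dups DIR: print groups of files with matching checksums, one
# path per line, groups separated by a blank line.
# Assumptions: GNU find/xargs/md5sum/uniq; no tabs, newlines or
# backslashes in path names.
find_dups() {
    # List "size<TAB>path" for every regular file, ordered by size.
    find "$1" -xdev -type f -printf '%s\t%p\n' |
    sort -n |
    # Pass 1: keep only paths whose size is shared with another file.
    awk -F'\t' '
        NR > 1 && $1 == size { if (!shown) print path; print $2; shown = 1 }
        NR == 1 || $1 != size { shown = 0 }
        { size = $1; path = $2 }
    ' |
    # Pass 2: checksum the survivors; each file is read at most once.
    xargs -r -d '\n' md5sum |
    sort |
    # Keep every line whose 32-character checksum occurs more than once.
    uniq -w32 --all-repeated=separate |
    # Drop the checksum and the two separator characters after it.
    cut -c35-
}
```

Calling `find_dups some/dir' prints each group of candidate duplicates.
The work is dominated by the two sorts and by reading each candidate
file once, rather than by comparing every pair of files.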
