Mail Archives: cygwin/2008/11/23/09:15:38

X-Recipient: archive-cygwin AT delorie DOT com
X-Spam-Check-By: sourceware.org
X-Authority-Analysis: v=1.0 c=1 a=8e6AR2xg2moA:10 a=kwy4nkr_1WEA:10 a=xe8BsctaAAAA:8 a=u_GAltdbkJRXT_7bVAgA:9 a=Cxqw9LKRo7lO3a5SYnIA:7 a=dHExqALjuPBSSzKPT-fQsKx_w60A:4 a=eDFNAWYWrCwA:10 a=rPt6xJ-oxjAA:10
Message-ID: <49296551.4010801@byu.net>
Date: Sun, 23 Nov 2008 07:14:41 -0700
From: Eric Blake <ebb9 AT byu DOT net>
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.18) Gecko/20081105 Thunderbird/2.0.0.18 Mnenhy/0.7.5.666
MIME-Version: 1.0
To: cygwin AT cygwin DOT com, bug-coreutils <bug-coreutils AT gnu DOT org>
Subject: Re: "du -b --files0-from=-" running out of memory
References: <nacii4p76633jbufvfoj4qjesrph05rjga AT 4ax DOT com>
In-Reply-To: <nacii4p76633jbufvfoj4qjesrph05rjga@4ax.com>
X-IsSubscribed: yes
Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm
List-Id: <cygwin.cygwin.com>
List-Subscribe: <mailto:cygwin-subscribe AT cygwin DOT com>
List-Archive: <http://sourceware.org/ml/cygwin/>
List-Post: <mailto:cygwin AT cygwin DOT com>
List-Help: <mailto:cygwin-help AT cygwin DOT com>, <http://sourceware.org/ml/#faqs>
Sender: cygwin-owner AT cygwin DOT com
Mail-Followup-To: cygwin AT cygwin DOT com
Delivered-To: mailing list cygwin AT cygwin DOT com

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

[adding the upstream coreutils list]

According to Barry Kelly on 11/23/2008 6:24 AM:
> I have a problem with du running out of memory.
> 
> I'm feeding it a list of null-separated file names via standard input,
> to a command-line that looks like:
> 
>   du -b --files0-from=-
> 
> The problem is that when du is run in this way, it leaks memory like a
> sieve. I feed it about 4.7 million paths but eventually it falls over as
> it hits the 32-bit address space limit.

That's because du must keep track of which files it has visited, so that
it can determine whether to recount or ignore hard links to files it has
already seen.  The upstream ls source code was recently changed to store
this information only for command-line arguments, rather than for every
file visited; I wonder if a similar change for du would make sense.
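For illustration, the bookkeeping described above can be sketched in
shell (a rough sketch using GNU find/sort/awk, not the actual coreutils
implementation): summing each (device, inode) pair exactly once is what
keeps the total correct in the presence of hard links, and that pair set
is also the memory that grows with every file visited.

```shell
# Sketch of du-style hard-link accounting (GNU find/sort/awk assumed).
dir=$(mktemp -d)
printf '12345' > "$dir/a"
ln "$dir/a" "$dir/b"          # hard link: same inode, two names

# %D = device, %i = inode, %s = size.  Deduplicating on the
# (device, inode) key drops the second link, so the 5-byte file
# is summed only once.
find "$dir" -type f -printf '%D %i %s\n' \
  | sort -u -k1,2 \
  | awk '{sum += $3} END {print sum}'    # prints 5, not 10

rm -rf "$dir"
```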

> 
> Now, I can understand why a du -c might want to exclude excess hard
> links to files, but that at most requires a hash table for device &
> inode pairs - it's hard to see why 4.7 million entries would cause OOM -
> and in any case, I'm not asking for a grand total.
> 
> Is there any other alternative to running e.g. xargs -0 du -b, possibly
> with a high -n <arg> to xargs to limit memory leakage?
> 
> -- Barry
> 
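As an illustration of the xargs alternative Barry mentions (a sketch
assuming GNU du and xargs; the scratch directory stands in for the real
multi-million-file list): batching caps how large du's visited-inode
table can grow in any one run, at the cost that hard links split across
batches get counted more than once.

```shell
# Batched du via xargs: -0 reads NUL-separated names, -n 1000 caps
# each du invocation at 1000 paths, bounding per-run memory.
dir=$(mktemp -d)
printf 'x'  > "$dir/a"
printf 'xy' > "$dir/b"

find "$dir" -type f -print0 | xargs -0 -n 1000 du -b

rm -rf "$dir"
```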

- --
Don't work too hard, make some time for fun as well!

Eric Blake             ebb9 AT byu DOT net
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Public key at home.comcast.net/~ericblake/eblake.gpg
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkkpZVEACgkQ84KuGfSFAYAFDwCfUXyduR1FfsDNn/RhzYAmy9lH
issAn0TQPQ0gQ6UKTkei1jDtVnxiQD5c
=0H69
-----END PGP SIGNATURE-----

--
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
Problem reports:       http://cygwin.com/problems.html
Documentation:         http://cygwin.com/docs.html
FAQ:                   http://cygwin.com/faq/
