Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm
List-Subscribe: <mailto:cygwin-subscribe AT cygwin DOT com>
List-Archive: <http://sourceware.org/ml/cygwin/>
List-Post: <mailto:cygwin AT cygwin DOT com>
List-Help: <mailto:cygwin-help AT cygwin DOT com>, <http://sourceware.org/ml/#faqs>
Sender: cygwin-owner AT cygwin DOT com
Mail-Followup-To: cygwin AT cygwin DOT com
Delivered-To: mailing list cygwin AT cygwin DOT com
X-Authentication-Warning: slinky.cs.nyu.edu: pechtcha owned process doing -bs
Date: Tue, 25 May 2004 18:55:35 -0400 (EDT)
From: Igor Pechtchanski <pechtcha AT cs DOT nyu DOT edu>
Reply-To: cygwin AT cygwin DOT com
To: Bruce Dobrin <dobrin AT imageworks DOT com>
cc: cygwin AT cygwin DOT com
Subject: Re: shell cmds crapping out with large numbers of files
In-Reply-To: <018601c442a4$6c0197d0$4d1f1cac@THEODOLITE>
Message-ID: <Pine.GSO.4.58.0405251810061.2742@slinky.cs.nyu.edu>
References: <HDEFLKBDJBPHMOHGIIPCKECICAAA DOT greno AT verizon DOT net> <018601c442a4$6c0197d0$4d1f1cac AT THEODOLITE>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Scanned-By: MIMEDefang 2.39

On Tue, 25 May 2004, Bruce Dobrin wrote:

> {uname -a
> CYGWIN_NT-5.1 THEODOLITE 1.5.9(0.112/4/2) 2004-03-18 23:05 i686 unknown
> unknown Cygwin
> }
>
> I need to process very large numbers ( up to 100,000) of imagefiles.  I
> noticed my foreach loops start crapping out when the number of files grows
> near 1500.  It feels like a 32bit memory addressing problem to me, but I
> don't know how to check for that.  I wrote a foreach loop to generate files
> (0 to xxxx) and then list them and it died at 1471
>
> here is an example of the problem:
>
> dobrin AT THEODOLITE:/home/dobrin/longtest> ls flern* | wc
>    1471    1471   32726
> dobrin AT THEODOLITE:/home/dobrin/longtest> touch flern0001471.plern.poo
> dobrin AT THEODOLITE:/home/dobrin/longtest> ls flern* | wc
>       2 [main] -tcsh 2396 cmalloc: cmalloc returned NULL
>       0       0       0
> Segmentation fault (core dumped)
> dobrin AT THEODOLITE:/home/dobrin/longtest> rm flern0001471.plern.poo
> dobrin AT THEODOLITE:/home/dobrin/longtest> ls flern* | wc
>    1471    1471   32726
                    ^^^^^
Cygwin has a 32k command-line length limit.  That's what xargs was
invented for.  The proper message for this, however, is "Arg list too
long", so the core dump is most likely due to a bug in tcsh glob expansion.
It also looks specific to your system -- I can't reproduce[*] the crash on
Win2k SP3 with the 20040520 snapshot (tcsh-6.12.00-7).  Try fiddling with
different programs to see which ones trigger it.  'Ware of aliases.  It's
also possible that a specific length of the glob expansion triggers it,
although the exact numbers above worked for me.

> [snip]
>
> I Currently am processing the files in batches of 1000 to avoid the problem.
> I tried the same thing on my linux box and it works fine.
>
> Thankyou

<http://cygwin.com/acronyms/#CYNUX>.  POSIX allows command-line length
limits, and Cygwin has one.
	Igor

[*] Here's one of my attempts to reproduce it:

$ echo $version
tcsh 6.12.00 (Astron) 2002-07-23 (i386-intel-posix) options 8b,nls,dl,al,rh,color
$ /bin/ls
$ touch `seq 1 6770`
$ /bin/ls * | wc
   6770    6770   32743
$ touch 6771
$ /bin/ls * | wc
/bin/ls: Arg list too long.
      0       0       0
$ echo * | wc
      1    6771   32748
$ touch `seq 6772 6776`
$ echo * | wc
      1    6776   32773
$

--
                                http://cs.nyu.edu/~pechtcha/
      |\      _,,,---,,_                pechtcha AT cs DOT nyu DOT edu
ZZZzz /,`.-'`'    -.  ;-;;,_            igor AT watson DOT ibm DOT com
     |,4-  ) )-,_. ,\ (  `'-'           Igor Pechtchanski, Ph.D.
    '---''(_/--'  `-'\_) fL     a.k.a JaguaR-R-R-r-r-r-.-.-.  Meow!

"I have since come to realize that being between your mentor and his route
to the bathroom is a major career booster."  -- Patrick Naughton

--
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
Problem reports:       http://cygwin.com/problems.html
Documentation:         http://cygwin.com/docs.html
FAQ:                   http://cygwin.com/faq/
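
For reference, here is a minimal sketch of the xargs approach mentioned in the reply. It is not from the original thread. It assumes a POSIX-style shell with find, xargs, seq, and mktemp available. The flern*.plern names, the count of 3000, and the scratch directory are made up for the demo, and the batch size of 1000 only echoes the workaround described in the quoted message.

```shell
#!/bin/sh
# Sketch only: file names, counts, and the scratch directory are demo assumptions.
workdir=$(mktemp -d) || exit 1
cd "$workdir" || exit 1

# Create 3000 empty files. The names travel through a pipe, so no single
# command line ever has to hold all of them.
seq 1 3000 | sed 's/.*/flern&.plern/' | xargs touch

# Report this system's limit on argument-list size, in bytes.
echo "ARG_MAX: $(getconf ARG_MAX)"

# find matches the pattern itself (no shell glob expansion), and xargs
# splits the names across as many ls invocations as needed.
total=$(find . -name 'flern*' | xargs ls | wc -l)
echo "files listed: $total"

# Explicit batches of at most 1000 names per command; each inner sh
# prints how many arguments it received, one line per batch.
batches=$(find . -name 'flern*' | xargs -n 1000 sh -c 'echo "$#"' sh | wc -l)
echo "batches: $batches"

# Clean up the scratch directory.
cd / && rm -rf "$workdir"
```

The pattern is quoted so that find, not the shell, does the matching; an unquoted flern* would be expanded by the shell and could hit the same limit. Plain find | xargs splits names on whitespace, so file names containing spaces or newlines would need extra care (for example find's -exec ... {} + form).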