Subject: RE: xargs gives grep/gawk too much
Date: Wed, 13 Dec 2006 14:17:56 -0500
From: "Buchbinder, Barry \(NIH/NIAID\) [E]"
Message-ID: <31DDB7BE4BF41D4888D41709C476B657041694B6@NIHCESMLBX5.nih.gov>
In-Reply-To: <20061210063339.GB15846@trixie.casa.cgf.cx>
Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm

Christopher Faylor wrote on Sunday, December 10, 2006 1:34 AM:
> On Sun, Dec 10, 2006 at 01:21:19AM -0500, Christopher Faylor wrote:
>> On Sat, Dec 09, 2006 at 11:31:05AM -0800, Karr, David wrote:
>>> If the point of this note is to get your pipeline to work, would it
>>> help if you added something like "-n 30" to the xargs command? This
>>> should execute one instance of 'grep -l -e "error '1234567890'"' for
>>> every 30 lines of output from the previous pipe entry.
>>>
>>> $ echo path/* | \
>>> tr ' ' '\n' | \
>>> grep -v -e '\*' | \
>>> xargs -r grep -l -e "error '1234567890'" | \
>>> rest_of_pipe
>>
>> I'm sure that the point of the note was to report a Cygwin bug.
>
> ...but I can't duplicate this with the latest snapshot...

Sorry for my delayed follow-up, but between family duties and work ...

Below is a test that looks at the length of the command line and the
number of arguments. I started out by looking for the exact line length
that would cause problems. The line length that resulted was well under
32k, so I started looking at the number of arguments. I initially came
up with a number in the 5000s.
It initially seemed to be consistent, but after I took a break, did
some things in other bash shells, and started up again (I think I was
in the same shell), the number changed. As you can see below, I'm now
getting errors with half as many arguments. (Different machines (laptop
vs. desktop), but the setups should be very close, if not identical.)

Especially interesting is that as the line length gets longer, a number
of arguments that had been giving the error stops giving it. Also,
inserting tr ' ' '\0' between gawk and xargs and adding -0 to xargs
shifted the point where the errors start down by one, from 2868/2869 to
2867/2868.

As for my scripts, changing the directory from which I was doing the
echo path/* fixed things, though now that I've done these experiments,
I may switch to using the -s or -n options of xargs.

Thanks for the suggestions. And a big thanks in general to all the
cygwin developers and package maintainers, and in particular to whoever
may decide to debug this (presuming that this is a problem with cygwin
or one of its packages and that one of the volunteers decides to debug
it). And if this is not a cygwin/package problem, I'd still appreciate
any hints about what I might be doing wrong.
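For reference, the -n / -s workaround mentioned above might look roughly like the sketch below. The batch limits (500 arguments, 16000 bytes) are arbitrary values chosen for illustration, not limits measured on Cygwin, and seq is only a stand-in for the real file list.

```shell
# Generate 3000 short arguments; seq stands in for the real list of paths.

# -n 500 caps every echo invocation at 500 arguments, so xargs has to run
# echo several times; each run prints one line, so wc -l counts the batches.
batches_n=$(seq 1 3000 | xargs -r -n 500 echo | wc -l | tr -d ' ')

# -s 16000 caps the length of each constructed command line (in bytes)
# instead of the argument count. No argument is dropped either way, so
# counting words across all batches still gives the full 3000.
words_s=$(seq 1 3000 | xargs -r -s 16000 echo | wc -w | tr -d ' ')

echo "batches with -n 500: $batches_n"
echo "arguments seen with -s 16000: $words_s"
```

Either option only changes how the arguments are grouped into command invocations, which is why it suits commands like grep -l that can be run once per batch.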
- Barry

===
for X in 1 2 4 8 16
do
  for A in 2968 2969
  do
    echo
    echo $X $A | \
    gawk '{
      for (N = 1; N <= $1; N++) { x = x "x" }
      x = " " x
      for (N = 1; N <= $2; N++) { o = o x }
      print "arg len: " $1 ", no args: " $2 ", tot len: " length(o) - 1 > "/dev/stderr"
      print substr(o,2)
    }' | \
    xargs -r echo | \
    wc | \
    grep -v -e ' 0 * 0 * 0'
  done
done

arg len: 1, no args: 2968, tot len: 5935
1 2968 5936

arg len: 1, no args: 2969, tot len: 5937
xargs: echo: Argument list too long

arg len: 2, no args: 2968, tot len: 8903
1 2968 8904

arg len: 2, no args: 2969, tot len: 8906
xargs: echo: Argument list too long

arg len: 4, no args: 2968, tot len: 14839
1 2968 14840

arg len: 4, no args: 2969, tot len: 14844
xargs: echo: Argument list too long

arg len: 8, no args: 2968, tot len: 26711
1 2968 26712

arg len: 8, no args: 2969, tot len: 26720
1 2969 26721

arg len: 16, no args: 2968, tot len: 50455
2 2968 50456

arg len: 16, no args: 2969, tot len: 50472
2 2969 50473
===

Doing it again later in a different bash shell gave the following. So
it still is changing with immediate conditions.

===
arg len: 1, no args: 2968, tot len: 5935
1 2968 5936

arg len: 1, no args: 2969, tot len: 5937
1 2969 5938

arg len: 2, no args: 2968, tot len: 8903
1 2968 8904

arg len: 2, no args: 2969, tot len: 8906
1 2969 8907

arg len: 4, no args: 2968, tot len: 14839
1 2968 14840

arg len: 4, no args: 2969, tot len: 14844
1 2969 14845

arg len: 8, no args: 2968, tot len: 26711
xargs: echo: Argument list too long

arg len: 8, no args: 2969, tot len: 26720
xargs: echo: Argument list too long

arg len: 16, no args: 2968, tot len: 50455
xargs: echo: Argument list too long

arg len: 16, no args: 2969, tot len: 50472
xargs: echo: Argument list too long
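The "tot len" figures in the results are plain arithmetic on the two loop variables: NO_ARGS arguments of ARG_LEN characters joined by single spaces come to NO_ARGS * (ARG_LEN + 1) - 1 bytes, and wc reports one more for the trailing newline. The sketch below recomputes them and prints them next to the limit that getconf reports. The helper name tot_len is made up for this illustration, and whether ARG_MAX is the limit actually being hit in these runs is an open question, not something the sketch establishes.

```shell
# tot_len ARG_LEN NO_ARGS
# Bytes in NO_ARGS arguments of ARG_LEN characters each, joined by single
# spaces, with no trailing newline (the figure the gawk script reports).
# The name tot_len is hypothetical, introduced only for this sketch.
tot_len() {
    echo $(( $2 * ($1 + 1) - 1 ))
}

# Limit the system advertises for arguments plus environment, in bytes.
arg_max=$(getconf ARG_MAX)

for X in 1 2 4 8 16
do
    for A in 2968 2969
    do
        echo "arg len: $X, no args: $A, tot len: $(tot_len "$X" "$A"), ARG_MAX: $arg_max"
    done
done
```

Seeing the computed lengths beside the advertised limit makes it easier to tell whether a given failure tracks total bytes or something else, such as the argument count.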