Mail Archives: cygwin/2009/11/07/17:27:30

Subject: Re: [1.7] BUG - GREP slows to a crawl with large number of matches on a single file
From: Richard Foulk <richard AT skydive1 DOT com>
To: cygwin AT cygwin DOT com
In-Reply-To: <806a89db0911061422l290ff84u3d58cbbe1d3eface@mail.gmail.com>
References: <806a89db0911061422l290ff84u3d58cbbe1d3eface AT mail DOT gmail DOT com>
Date: Sat, 07 Nov 2009 12:27:11 -1000
Message-Id: <1257632832.5773.48.camel@fast>
Mime-Version: 1.0

Jim Reisert wrote:

>On Fri, Nov 6, 2009 at 7:12 AM, Cooper, Karl (US SSA)
><karl DOT cooper AT baesystems DOT com> wrote:
>
>> Corinna Vinschen wrote:
>>> Or try LANG=C.ASCII since LANG=C will still return UTF-8 as charset
>>> when calling nl_langinfo(CHARSET).
>>
>> Yes, this solves it:
>>
>> $ time LC_ALL=C.ASCII grep dog testfile | wc
>>  100000  900000 4500000
>>
>> real    0m0.359s
>> user    0m0.279s
>> sys     0m0.232s
>
>
> I just tried this on my system, I routinely grep groups of files
> containing 100K lines.  I was *astounded* how fast "grep" is after
> setting LC_ALL=C.ASCII !


The second run of grep on the same file is usually much faster anyway,
because the file is already in the disk buffer cache.
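
To tell the two effects apart, read the file once before timing anything,
so that both timed runs come from the cache. What follows is only a rough
sketch, not a proper benchmark. The test file contents, the variable names
and the FAST_LOCALE override are made up for illustration. The locale
defaults to plain C so the script runs anywhere; on Cygwin 1.7 set
FAST_LOCALE=C.ASCII as suggested earlier in the thread.

```shell
#!/bin/bash
# Sketch: compare grep under the current locale and under a plain C locale,
# with the file already cached so disk buffering does not skew the result.

testfile=$(mktemp)                  # hypothetical scratch file
fast_locale=${FAST_LOCALE:-C}       # the thread used C.ASCII on Cygwin 1.7

# Build 100000 lines that all match "dog" (illustrative content only).
yes 'the quick brown dog jumps over the lazy fox' | head -n 100000 > "$testfile"

# Warm-up read: after this, both timed runs are served from the cache.
cat "$testfile" > /dev/null

echo "current locale, cache warm:"
time count_default=$(grep dog "$testfile" | wc -l)

echo "LC_ALL=$fast_locale, cache warm:"
time count_fast=$(LC_ALL=$fast_locale grep dog "$testfile" | wc -l)

echo "matches: $count_default and $count_fast"
rm -f "$testfile"
```

If the second timing is still far lower than the first after the warm-up
read, the difference is unlikely to be caused by disk buffering alone.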




