Subject: Re: 1.7] BUG - GREP slows to a crawl with large number of matches on a single file
From: Richard Foulk
To: cygwin AT cygwin DOT com
Date: Sat, 07 Nov 2009 12:27:11 -1000

Jim Reisert wrote:
>On Fri, Nov 6, 2009 at 7:12 AM, Cooper, Karl (US SSA)
> wrote:
>
>> Corinna Vinschen wrote:
>>> Or try LANG=C.ASCII since LANG=C will still return UTF-8 as charset
>>> when calling nl_langinfo(CHARSET).
>>
>> Yes, this solves it:
>>
>> $ time LC_ALL=C.ASCII grep dog testfile | wc
>> 100000 900000 4500000
>>
>> real 0m0.359s
>> user 0m0.279s
>> sys 0m0.232s
>
>
> I just tried this on my system, I routinely grep groups of files
> containing 100K lines. I was *astounded* how fast "grep" is after
> setting LC_ALL=C.ASCII !

The second run of grep is usually much faster due to disk buffering.
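
To separate the locale effect from the disk-buffering effect mentioned in the reply, one option is to warm the cache with an untimed run first and only then time grep under each locale. What follows is a rough sketch, not anything posted in the thread: the test file contents, the elapsed_ms helper, and the FAST_LOCALE variable are all made up for illustration. It defaults to LC_ALL=C, since C.ASCII is the spelling discussed for Cygwin 1.7 and may not be recognized elsewhere, and it relies on `date +%s%N`, which is a GNU extension.

```shell
#!/bin/sh
# Sketch only: the file contents, variable names, and the FAST_LOCALE knob
# are assumptions made for illustration; they do not come from the thread.

testfile=$(mktemp)

# 100000 matching lines, roughly the size of the quoted test.
awk 'BEGIN { for (i = 0; i < 100000; i++)
                 print "The quick brown fox jumps over the lazy dog" }' > "$testfile"

# The thread used C.ASCII on Cygwin 1.7; plain C is the portable spelling.
fast_locale=${FAST_LOCALE:-C}

# Run a command, discard its output, and print the elapsed wall time in ms.
# date +%s%N is a GNU extension (present in Cygwin and most Linux systems).
elapsed_ms() {
    start=$(date +%s%N)
    "$@" > /dev/null
    end=$(date +%s%N)
    echo $(( ( end - start ) / 1000000 ))
}

# Untimed warm-up run, so both timed runs read the file from the cache.
grep -c dog "$testfile" > /dev/null

default_ms=$(elapsed_ms grep -c dog "$testfile")
fast_ms=$(elapsed_ms env LC_ALL="$fast_locale" grep -c dog "$testfile")

# Check that both locales find the same number of matching lines.
count_default=$(grep -c dog "$testfile")
count_fast=$(env LC_ALL="$fast_locale" grep -c dog "$testfile")

echo "current locale: ${default_ms} ms, ${count_default} matches"
echo "LC_ALL=${fast_locale}: ${fast_ms} ms, ${count_fast} matches"

rm -f "$testfile"
```

On Cygwin 1.7 this could be run as `FAST_LOCALE=C.ASCII sh bench.sh` (bench.sh being a placeholder name). If the two timings still differ clearly after the warm-up run, the difference is more plausibly due to the locale than to disk buffering; the absolute numbers will vary by system.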