From: Linda Walsh
Date: Fri, 06 Nov 2009 16:09:54 -0800
To: cygwin AT cygwin DOT com
Subject: Re: [1.7] BUG - GREP slows to a crawl with large number of matches on a single file

Jim Reisert AD1C wrote:
> On Fri, Nov 6, 2009 at 7:12 AM, Cooper, Karl (US SSA)
>> Yes, this solves it:
>> $ time LC_ALL=C.ASCII grep dog testfile | wc
>>  100000  900000 4500000
>>
>> real    0m0.359s
>> user    0m0.279s
>> sys     0m0.232s
>
> I just tried this on my system, I routinely grep groups of files
> containing 100K lines. I was *astounded* how fast "grep" is after
> setting LC_ALL=C.ASCII !

----
I'm really curious whether 'pcregrep' has the same problem with Unicode
on 1.7. Could you try it on the same input? It's based on Perl's regex
syntax, and Perl regexes have supported Unicode for a few years now.

-l
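
The pcregrep comparison asked for above could be run along these lines. This is only a sketch under stated assumptions: the generated input is a stand-in for the thread's testfile (which is not shown here), `count_matches` and `run_timed` are hypothetical helper names, and C.UTF-8 is assumed to be the Unicode locale in question. Timing uses whole seconds from `date +%s`, which is coarse but enough to tell a sub-second run from a crawl.

```shell
#!/bin/sh
# Stand-in input (assumption): 100000 identical lines that all contain "dog".
testfile=$(mktemp)
trap 'rm -f "$testfile"' EXIT
yes 'the quick brown fox jumps over the lazy dog' | head -n 100000 > "$testfile"

# Print the number of lines matching "dog".
# $1 = value for LC_ALL, $2 = matcher to run (grep or pcregrep).
count_matches() {
    locale_name=$1
    shift
    LC_ALL=$locale_name "$@" dog "$testfile" | wc -l | tr -d ' '
}

# Run count_matches and report the elapsed wall-clock time in whole seconds.
run_timed() {
    start=$(date +%s)
    n=$(count_matches "$@")
    end=$(date +%s)
    echo "LC_ALL=$1 $2: $n matching lines in about $((end - start))s"
}

# Compare both tools under the ASCII locale from the thread and a UTF-8 one.
for loc in C.ASCII C.UTF-8; do
    for tool in grep pcregrep; do
        if command -v "$tool" >/dev/null 2>&1; then
            run_timed "$loc" "$tool"
        else
            echo "$tool is not installed, skipping"
        fi
    done
done
```

If pcregrep stays fast under the UTF-8 locale while grep does not, that would point at grep's multibyte handling rather than at the locale support in 1.7 itself.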