Date: Sun, 8 Nov 2009 11:30:38 +0100
From: Corinna Vinschen <corinna-cygwin@cygwin.com>
To: cygwin@cygwin.com
Subject: Re: [1.7] BUG - GREP slows to a crawl with large number of matches on a single file
Message-ID: <20091108103038.GY26344@calimero.vinschen.de>
Reply-To: cygwin@cygwin.com
Mail-Followup-To: cygwin@cygwin.com
References: <26224019.post@talk.nabble.com>  <4AF393C6.3000505@tlinx.org>  <20091106033243.GB30410@ednor.casa.cgf.cx>  <4AF42027.80604@towo.net>  <20091106135152.GK26344@calimero.vinschen.de>  <814i7o$49eric@dmzms99802.na.baesystems.com>  <806a89db0911061422l290ff84u3d58cbbe1d3eface@mail.gmail.com>  <1257632832.5773.48.camel@fast>  <26249599.post@talk.nabble.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <26249599.post@talk.nabble.com>
User-Agent: Mutt/1.5.20 (2009-06-14)
Mailing-List: contact cygwin-help@cygwin.com; run by ezmlm
Precedence: bulk
List-Id: <cygwin.cygwin.com>
List-Unsubscribe: <mailto:cygwin-unsubscribe-archive-cygwin=delorie.com@cygwin.com>
List-Subscribe: <mailto:cygwin-subscribe@cygwin.com>
List-Archive: <http://sourceware.org/ml/cygwin/>
List-Post: <mailto:cygwin@cygwin.com>
List-Help: <mailto:cygwin-help@cygwin.com>, <http://sourceware.org/ml/#faqs>
Sender: cygwin-owner@cygwin.com
Delivered-To: mailing list cygwin@cygwin.com

On Nov  7 15:26, aputerguy wrote:
> 
> Changing LC_ALL also solved the problem for me.
> But it raises the question of how many other basic, taken-for-granted
> functions might be affected by this apparent UTF-8 slowdown. And again, we
> are not talking about some minor overhead; we are talking about a slowdown
> of 1500X, or 150,000%!

Yeah, that still seems really strange to me.  In my testing, the multibyte
to widechar conversion that grep performs for UTF-8 took only 1.5 to
4 seconds for 10 times as many input lines as in your case.  It still
puzzles me where grep is wasting the time.
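
One way to check a figure like this independently is to time the
conversion by itself, outside of grep.  The following is only a minimal
sketch in Python, not grep's code.  It assumes that decoding UTF-8 bytes
into a Python string is a fair stand-in for the multibyte to widechar
conversion, and the function name time_utf8_decode and the sample lines
are made up for illustration.

```python
import time


def time_utf8_decode(lines):
    """Decode each UTF-8 encoded line; return (characters decoded, seconds)."""
    start = time.perf_counter()
    total_chars = 0
    for raw in lines:
        # Rough analogue of a per-line multibyte to wide-character conversion.
        total_chars += len(raw.decode("utf-8"))
    return total_chars, time.perf_counter() - start


# Hypothetical input: 100,000 short lines containing some non-ASCII characters.
sample = ["match line %d äöü\n" % i for i in range(100_000)]
encoded = [s.encode("utf-8") for s in sample]

chars, seconds = time_utf8_decode(encoded)
print("decoded %d characters in %.3f s" % (chars, seconds))
```

If the conversion alone finishes quickly on input of comparable size,
that would suggest the slowdown lies somewhere else in grep.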


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat
