Message-ID: <4AF43F97.8040801@towo.net>
Date: Fri, 06 Nov 2009 16:24:07 +0100
From: Thomas Wolff
To: cygwin AT cygwin DOT com
Subject: Re: 1.7] BUG - GREP slows to a crawl with large number of matches on a single file
In-Reply-To: <4AF439F0.8060203@towo.net>

I wrote:
> Corinna Vinschen wrote:
>> ...
> I extended your test program to demonstrate the inefficiency of the
> standard mbrtowc function. Instead I use a function from my editor
> (mined) to extract a Unicode character from a UTF-8 sequence. This is
> the simple case only, not converting character sets other than UTF-8
> but that's the same thing mbrtowc does in the sample invocation.
> Program attached. Results below.

Actually, there was a bug in the test program: wc was not an array, which
led to variable corruption and thus to incorrect test results in my
extension. Sorry for the embarrassing oversight. Anyway, the corrected
results still favor my algorithm by a factor of 3 to 4.
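For readers without the attachment, here is a minimal sketch of the kind of direct UTF-8 decoding meant above. It is not the code from mined and not the attached test program; the name utf8_decode and its return convention are assumptions made for this illustration only. It handles UTF-8 input exclusively, which matches the simple case described in the quoted message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (illustration only, not taken from mined):
 * decode one UTF-8 sequence starting at s, reading at most n bytes.
 * On success, store the code point in *out and return the number of
 * bytes consumed (1 to 4). Return 0 for empty, truncated, or invalid
 * input. */
size_t utf8_decode(const unsigned char *s, size_t n, uint32_t *out)
{
    /* Smallest code point that legitimately needs a sequence of each length. */
    static const uint32_t min_cp[5] = { 0, 0, 0x80, 0x800, 0x10000 };
    size_t len, i;
    uint32_t cp;

    assert(s != NULL && out != NULL);
    if (n == 0)
        return 0;

    /* The lead byte determines the sequence length and the initial bits. */
    if (s[0] < 0x80) {
        *out = s[0];
        return 1;
    } else if ((s[0] & 0xE0) == 0xC0) {
        len = 2; cp = s[0] & 0x1F;
    } else if ((s[0] & 0xF0) == 0xE0) {
        len = 3; cp = s[0] & 0x0F;
    } else if ((s[0] & 0xF8) == 0xF0) {
        len = 4; cp = s[0] & 0x07;
    } else {
        return 0;               /* stray continuation byte or invalid lead */
    }

    if (n < len)
        return 0;               /* truncated sequence */

    /* Each continuation byte must look like 10xxxxxx and adds 6 bits. */
    for (i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80)
            return 0;
        cp = (cp << 6) | (uint32_t)(s[i] & 0x3F);
    }

    /* Reject overlong encodings, UTF-16 surrogates, and out-of-range values. */
    if (cp < min_cp[len] || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;

    *out = cp;
    return len;
}
```

The result goes into a single caller-owned uint32_t, and the return value tells the caller how far to advance. A loop that wants to keep every decoded character needs a separately sized array for them.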
Thomas