Mail Archives: cygwin/2011/09/22/10:33:19

To: cygwin AT cygwin DOT com
From: Oleksandr Gavenko <gavenko AT bifit DOT com DOT ua>
Subject: Re: 'cygcheck -f' pattern syntax.
Date: Thu, 22 Sep 2011 17:32:38 +0300
Lines: 102
Message-ID: <j5fgu0$2on$1@dough.gmane.org>
References: <j5f8io$2uq$1 AT dough DOT gmane DOT org> <4E7B2D8F DOT 409 AT gmail DOT com>
Mime-Version: 1.0
User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:6.0.2) Gecko/20110902 Thunderbird/6.0.2
In-Reply-To: <4E7B2D8F.409@gmail.com>
X-IsSubscribed: yes
Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm
List-Id: <cygwin.cygwin.com>
List-Unsubscribe: <mailto:cygwin-unsubscribe-archive-cygwin=delorie DOT com AT cygwin DOT com>
List-Subscribe: <mailto:cygwin-subscribe AT cygwin DOT com>
List-Archive: <http://sourceware.org/ml/cygwin/>
List-Post: <mailto:cygwin AT cygwin DOT com>
List-Help: <mailto:cygwin-help AT cygwin DOT com>, <http://sourceware.org/ml/#faqs>
Sender: cygwin-owner AT cygwin DOT com
Mail-Followup-To: cygwin AT cygwin DOT com
Delivered-To: mailing list cygwin AT cygwin DOT com

22.09.2011 15:43, Marco atzeri wrote:
> On 9/22/2011 2:10 PM, Oleksandr Gavenko wrote:
>> So 'cygcheck -f' does not allow 'glob' and 'regex'.
>>
>> I wrote a simple script that allows regex patterns:
>>
>> #!/bin/sh
>>
>> regex=$(echo "$1" | sed -e 's|\\|&&|g' -e 's|=|\\=|g')
>>
>> for file in /etc/setup/*.lst.gz; do
>> name=${file#/etc/setup/}
>> name=${name%.lst.gz}
>> gzip -d -c $file | sed -n "\=$regex={s=^=$name: /=;p;}"
>> done
>>
>> But this script is extremely slow:
>>
>> $ time ./cygsearch.sh 'bin/emacs'
>> emacs: /usr/bin/emacs-nox.exe
>> emacs: /usr/bin/emacs.ico
>> emacs: /usr/bin/emacsclient.exe
>> emacs-X11: /usr/bin/emacs-X11.exe
>>
>> real 0m38.797s
>> user 0m44.620s
>> sys 0m25.574s
>>
>
> much faster
>
> $ time cp /etc/setup/*.gz .
>
> real 0m1.450s
> user 0m0.061s
> sys 0m1.202s
>
> $ time gunzip *.gz
>
> real 0m1.937s
> user 0m0.170s
> sys 0m1.764s
>
> $ time grep -H "stdio.h" *.lst
> cygwin.lst:usr/include/stdio.h
> cygwin.lst:usr/include/sys/stdio.h
>
> real 0m0.816s
> user 0m0.046s
> sys 0m0.781s
>
I tried to eliminate the sed call by using bash's built-in regex
matching, but without success; it is even slower:

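   for file in /etc/setup/*.lst.gz; do
     name=${file#/etc/setup/}
     name=${name%.lst.gz}
     # IFS= and -r keep each path exactly as stored in the list.
     gzip -d -c "$file" | while IFS= read -r line; do
       # Listed paths have no leading slash, so add one before matching.
       if [[ /"$line" =~ $regex ]]; then
         echo "$name: $line"
       fi
     done
   done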

   $ time ./cygsearch 'bin/emacs'
emacs: usr/bin/emacs-nox.exe
emacs: usr/bin/emacs.ico

real	0m57.955s
user	0m42.105s
sys	0m34.710s


Next I tried to avoid spawning any processes at all, with success:

   #!/usr/bin/python

   import sys
   import glob
   import re
   import gzip

   # Pattern from the command line, and a pattern that extracts the
   # package name from a list file path.
   r = re.compile(sys.argv[1])
   n = re.compile(r'/etc/setup/(.*)\.lst\.gz')

   for f in glob.glob(r'/etc/setup/*.lst.gz'):
       name = n.match(f).group(1)
       # Decompress in-process instead of spawning gzip.
       plain = gzip.open(f, "rb")
       for line in plain:
           line = line.rstrip('\n\r')
           if r.search(line) is not None:
               print '%s: %s' % (name, line)
       plain.close()

   $ time ./cygsearch.py 'bin/emacs'
emacs-X11: usr/bin/emacs-X11.exe
emacs: usr/bin/emacs-nox.exe

real	0m3.152s
user	0m2.421s
sys	0m0.171s
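
The script above is Python 2 (it uses the print statement). A minimal
Python 3 sketch of the same idea could look like the following. The
function name search_lists, the setup_dir parameter, the sorted file
order and the UTF-8 decoding with replacement characters are assumptions
of this sketch, not part of the original script.

```python
#!/usr/bin/env python3
"""Search Cygwin package file lists for paths matching a regex."""
import glob
import gzip
import os
import re
import sys


def search_lists(pattern, setup_dir="/etc/setup"):
    """Yield (package, path) for every listed path that matches pattern.

    setup_dir is a hypothetical parameter added so the function can be
    pointed at a copy of the lists; it defaults to /etc/setup.
    """
    regex = re.compile(pattern)
    suffix = ".lst.gz"
    # Sorting is an assumption made here for stable output order.
    for list_file in sorted(glob.glob(os.path.join(setup_dir, "*" + suffix))):
        # The package name is the file name without the ".lst.gz" suffix.
        package = os.path.basename(list_file)[: -len(suffix)]
        # Text mode decodes to str, so a str regex can be applied directly.
        with gzip.open(list_file, "rt", encoding="utf-8",
                       errors="replace") as plain:
            for line in plain:
                path = line.rstrip("\r\n")
                if regex.search(path):
                    yield package, path


# Only run the command-line part when exactly one pattern is given.
if __name__ == "__main__" and len(sys.argv) == 2:
    for package, path in search_lists(sys.argv[1]):
        print("%s: %s" % (package, path))
```

It would be run the same way, for example ./cygsearch3.py 'bin/emacs'
(the file name is hypothetical). As in the original, paths are matched
as stored in the lists, that is without a leading slash.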

