Date: Tue, 26 Sep 2006 12:48:24 +0100
From: "Richard Quadling"
Reply-To: RQuadling AT googlemail DOT com
To: cygwin AT cygwin DOT com
Subject: Re: grep weirdness - matching space character

On 26/09/06, Dave Korn wrote:
> On 26 September 2006 11:17, The Blog User wrote:
>
> > I am really struggling to understand what I am doing wrong here.
> >
> > I have a log file with a line that looks like this:
> >
> > ++ 04:51:32 All 94 items succeeded
> >
> > The binary data for that line is this:
> >
> > 2B 2B 20 30 34 3A 35 31 3A 33 32 20 41 6C 6C 20 39 34 20 69 74 65 6D 73 20
> > 73 75 63 63 65 65 64 65 64 0A
> >
> > using grep and tail (versions below) I am failing to match that line
> >
> > $ tail -1 /path/to/file/the.log | grep -a "All \d*.items succeeded"
>
> There's no such thing as \d.
>
> > however if I insert 3 (why three?) dots (or a .*) between 'All' and '\d' I
> > get a match, what is happening ?
>
> The dots are eating the '94' as well as the space.
>
> > This seems wrong to me, since - from my knowledge of regex's - that is
> > saying there must be three characters between the 'All' and the first
> > digit, yet I can see there is only a single space character.
>
> Escaping a d just matches a literal 'd'.
> So the expression '\d*' matches zero or more of the letter d. If you use
> the three dots to eat the two digits as well as the space, the optional
> any-number-of-d's is matched by the zero d's following, and then the
> trailing 'items succeeded' matches.
>
> Whereas with only the one dot, the dot matches the space, then there's
> zero-optional-'d's, then the '9' fails to match against '.items succeeded'.

$ cat c:\log.log
++ 04:51:32 All 90 items succeeded
++ 04:51:32 All 91 items succeeded
++ 04:51:32 All 92 items succeeded
++ 04:51:32 All 93 items succeeded
++ 04:51:32 All 94 items succeeded

$ tail -1 c:\log.log
++ 04:51:32 All 94 items succeeded

$ tail -1 c:\log.log | grep -a "All [0-9]* items succeeded"
++ 04:51:32 All 94 items succeeded

Is that what you wanted?

--
-----
Richard Quadling
Zend Certified Engineer : http://zend.com/zce.php?c=ZEND002498&r=213474731
"Standing on the shoulders of some very clever giants!"
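
For anyone who wants to try the digit-matching fix without touching a real log, here is a minimal sketch. It assumes a POSIX shell with grep, tail, and mktemp available; the temporary file and the variable names (log, match_bracket, match_class, count) are made up for illustration and are not from the thread.

```shell
# Hypothetical sample file standing in for the log discussed in the thread.
log=$(mktemp)
printf '++ 04:51:32 All 93 items succeeded\n++ 04:51:32 All 94 items succeeded\n' > "$log"

# A bracket expression matches digits in grep's default (basic) regex syntax.
match_bracket=$(tail -1 "$log" | grep -a "All [0-9]* items succeeded")

# The POSIX character class is an equivalent spelling of the same idea.
match_class=$(tail -1 "$log" | grep -a "All [[:digit:]]* items succeeded")

# [0-9]* also accepts zero digits; [0-9][0-9]* insists on at least one.
# -c prints the number of matching lines instead of the lines themselves.
count=$(grep -c "All [0-9][0-9]* items succeeded" "$log")

echo "$match_bracket"
echo "$match_class"
echo "$count"

rm -f "$log"
```

Note that `[0-9]*` on its own would also match a line with no number at all between the two spaces, so the `[0-9][0-9]*` form may be the safer choice if that matters. If your grep was built with Perl-regex support, `grep -P` should accept `\d` as a digit class, but the sketch above does not rely on that.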