Mail Archives: cygwin/2024/03/16/15:09:23

Date: Sat, 16 Mar 2024 19:08:35 +0000 (UTC)
To: "cygwin AT cygwin DOT com" <cygwin AT cygwin DOT com>
Message-ID: <115196859.4162286.1710616115289@mail.yahoo.com>
In-Reply-To: <CAEFTnVP=CV_7v=-yRgfsaxk6qbRE=s_LPPZonKPW8CdLiEvfZw@mail.gmail.com>
References: <CAEFTnVP=CV_7v=-yRgfsaxk6qbRE=s_LPPZonKPW8CdLiEvfZw AT mail DOT gmail DOT com>
Subject: Re: The grep 3.11 application when used in perl-regexp mode appears to now be broken
From: Kevin Schnitzius via Cygwin <cygwin AT cygwin DOT com>
Reply-To: Kevin Schnitzius <kometes AT yahoo DOT com>

On Saturday, March 16, 2024 at 02:02:31 PM EDT, Michael Goldshteyn via Cygwin <cygwin AT cygwin DOT com> wrote:

> $ grep -c -P '000$' a
> 0

> # Now you may be thinking, OK, it's because of the CR/LF line ending

$ LC_ALL=en_US grep -c --binary-files=text -P '000$' a
0
$ LC_ALL=en_US grep -c --binary-files=text -P '000\r$' a
1

It is an EOL issue; it is also a bug.

"By default, under MS-DOS and MS-Windows, grep guesses whether a file is
text or binary as described for the --binary-files option. If grep decides
the file is a text file, it strips the CR characters from the original file
contents (to make regular expressions with ^ and $ work correctly)."

The current release is not stripping the CR characters correctly in the case of DOS text files.
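
Until the stripping behaves as documented, two workarounds seem reasonable. The sketch below is only an illustration: the helper name grep_p_nocr and the temporary demo file are made up for this example, and it assumes a grep built with -P support. The first approach deletes every CR byte before grep sees the data (including any CR that is not part of a line ending); the second keeps the file untouched and allows an optional CR before the end of the line.

```shell
# Hypothetical helper: run "grep -P" on a file after deleting every CR byte.
# Usage: grep_p_nocr PATTERN FILE [extra grep options...]
# The body runs in a subshell so its variables do not leak into the caller.
grep_p_nocr() (
    pattern=$1
    file=$2
    shift 2
    tr -d '\r' < "$file" | grep -P "$@" -e "$pattern"
)

# Demo on a throwaway copy of the 6-byte file from the report.
demo=$(mktemp)
printf '1000\r\n' > "$demo"

# Workaround 1: strip the CRs first, then anchor with a plain "$".
stripped_count=$(grep_p_nocr '000$' "$demo" -c)

# Workaround 2: leave the file alone and accept an optional CR before "$".
optional_cr_count=$(grep -c -P '000\r?$' "$demo")

echo "stripped=$stripped_count optional_cr=$optional_cr_count"
rm -f "$demo"
```

Both counts should come out as 1 for this file. The second form is probably the safer choice when the input may contain CR bytes that are real data.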

Kevin

On Saturday, March 16, 2024 at 02:02:31 PM EDT, Michael Goldshteyn via Cygwin <cygwin AT cygwin DOT com> wrote:

I just updated my Cygwin64 installation, which includes the grep
utility, and grep's behavior has changed. It no longer handles Perl
regex matching the way it used to, as demonstrated below:

Simple test cases:
======================
$ ls -l a
-rwxr-xr-x 1 Michael None 6 Mar 16 12:15 a

$ hexdump -C a
00000000  31 30 30 30 0d 0a                                |1000..|
00000006

# Notice the CR/LF encoding after the "1000" text, as is the case for DOS
# text files
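
For anyone who wants to reproduce the test, a file with the same six bytes can be recreated as sketched below. The scratch directory and the variable names are assumptions made for this example; only the file content comes from the hexdump above.

```shell
# Hypothetical scratch location; the report's file is simply named "a".
workdir=$(mktemp -d)
sample="$workdir/a"

# Write "1000" followed by CR LF, matching the bytes 31 30 30 30 0d 0a.
printf '1000\r\n' > "$sample"

# Confirm the size in bytes and count the lines that end in a CR.
size=$(wc -c < "$sample" | tr -d ' ')
crlf_lines=$(grep -c "$(printf '\r')\$" "$sample")
echo "size=$size crlf_lines=$crlf_lines"
```

A size of 6 and one CR-terminated line indicate that the file matches the one in the report.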

# Now let's test a regular grep match
$ grep --version
grep (GNU grep) 3.11
Packaged by Cygwin (3.11-1)
Copyright (C) 2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <https://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by Mike Haertel and others; see
<https://git.savannah.gnu.org/cgit/grep.git/tree/AUTHORS>.

grep -P uses PCRE2 10.43 2024-02-16

$ grep '000' a
1000

# Match using pcre2
$ grep -P '000' a
1000

# OK, so far so good
$ grep -P '000$' a
# No match

# Put another way
$ grep -c -P '000$' a
0

# Now you may be thinking, OK, it's because of the CR/LF line ending
# But, I present the following
$ pcre2grep --version
pcre2grep version 10.43 2024-02-16

$ pcre2grep '000$' a
1000

# As a further cross-check, the same version of cygpcre2-8-0.dll is used by
# both grep.exe and pcre2grep.exe, as shown below. The "=>" annotation at the
# start of a line was added by me to point to the DLL in question:

$ ldd grep.exe
        ntdll.dll => /cygdrive/c/Windows/SYSTEM32/ntdll.dll (0x7ffa87d50000)
        KERNEL32.DLL => /cygdrive/c/Windows/System32/KERNEL32.DLL (0x7ffa87700000)
        KERNELBASE.dll => /cygdrive/c/Windows/System32/KERNELBASE.dll (0x7ffa85570000)
        cygwin1.dll => /usr/bin/cygwin1.dll (0x7ff9c84d0000)
        cygintl-8.dll => /usr/bin/cygintl-8.dll (0x5ee2d0000)
=>        cygpcre2-8-0.dll => /usr/bin/cygpcre2-8-0.dll (0x5ec2b0000)
        cygiconv-2.dll => /usr/bin/cygiconv-2.dll (0x3dff10000)

$ ldd pcre2grep.exe
        ntdll.dll => /cygdrive/c/Windows/SYSTEM32/ntdll.dll (0x7ffa87d50000)
        KERNEL32.DLL => /cygdrive/c/Windows/System32/KERNEL32.DLL (0x7ffa87700000)
        KERNELBASE.dll => /cygdrive/c/Windows/System32/KERNELBASE.dll (0x7ffa85570000)
=>        cygpcre2-8-0.dll => /usr/bin/cygpcre2-8-0.dll (0x5ec2b0000)
        cygbz2-1.dll => /usr/bin/cygbz2-1.dll (0x3ed560000)
        cygwin1.dll => /usr/bin/cygwin1.dll (0x7ff9c84d0000)
        cygz.dll => /usr/bin/cygz.dll (0x5ebb10000)

# For what it's worth, I also checked which versions of libintl8 and
# libiconv2 I have, and these are as follows:
# libintl8 0.22.4-1
# libiconv2 1.17-1

# And as an additional cross-check, I will include the following "complete
# hack":
$ strings cygintl-8.dll | pcre2grep '^\d\.\d\d'
0.22.4
0.22.4

$ strings cygiconv-2.dll | pcre2grep '^\d\.\d\d'
1.17
1.17

# For completeness, here is my CYGWIN environment variable setting and some
# other info:
$ echo "$CYGWIN"
glob:ignorecase winsymlinks:native pipe_byte
$ echo "$CYGWIN64_DIR"
c:\cygwin64
$ which grep
/usr/bin/grep
$ which pcre2grep
/usr/bin/pcre2grep
# No aliases are set up for these, either
$ alias grep pcre2grep
bash: alias: grep: not found
bash: alias: pcre2grep: not found
======================
Further comments:
I do not know in which version of grep.exe this misbehavior of the '-P'
switch (or at least its divergence from pcre2grep) began. I discovered it
after updating my Cygwin64 install to the latest grep version, which likely
also picked up the latest version of PCRE2 and other dependencies along the
way.

Thank you for looking into this and/or providing constructive comments on
the source of the issue,

Michael Goldshteyn

-- 
Problem reports:      https://cygwin.com/problems.html
FAQ:                  https://cygwin.com/faq/
Documentation:        https://cygwin.com/docs.html
Unsubscribe info:    https://cygwin.com/ml/#unsubscribe-simple

