Message-ID: | <4D9A43E4.50305@redhat.com> |
Date: | Mon, 04 Apr 2011 16:19:16 -0600 |
From: | Eric Blake <eblake AT redhat DOT com> |
User-Agent: | Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.15) Gecko/20110307 Fedora/3.1.9-0.39.b3pre.fc14 Lightning/1.0b3pre Mnenhy/0.8.3 Thunderbird/3.1.9 |
MIME-Version: | 1.0 |
To: | cygwin AT cygwin DOT com |
Subject: | Re: grep problem? |
References: | <CAD31248BE809F4A869854C597A558EE9433F274 AT TROUX-EX01 DOT hq DOT troux DOT com> |
In-Reply-To: | <CAD31248BE809F4A869854C597A558EE9433F274@TROUX-EX01.hq.troux.com> |
OpenPGP: | url=http://people.redhat.com/eblake/eblake.gpg |
X-IsSubscribed: | yes |
Mailing-List: | contact cygwin-help AT cygwin DOT com; run by ezmlm |
List-Id: | <cygwin.cygwin.com> |
List-Subscribe: | <mailto:cygwin-subscribe AT cygwin DOT com> |
List-Archive: | <http://sourceware.org/ml/cygwin/> |
List-Post: | <mailto:cygwin AT cygwin DOT com> |
List-Help: | <mailto:cygwin-help AT cygwin DOT com>, <http://sourceware.org/ml/#faqs> |
Sender: | cygwin-owner AT cygwin DOT com |
Mail-Followup-To: | cygwin AT cygwin DOT com |
Delivered-To: | mailing list cygwin AT cygwin DOT com |
On 04/04/2011 04:09 PM, Jim Garrison wrote:
> I'm getting weird behavior from grep. Searching for a bracketed range of
> characters (i.e. [A-Z]) is doing case-insensitive matching, while an
> identical but explicit character set match (i.e. [ABCDE...Z]) does not.

Your problem is not with grep, but with your LC_COLLATE setting (which
LC_ALL overrides when it is set). POSIX leaves the behavior of range
expressions (such as [A-Z]) unspecified in any locale other than C; and some
locales (like en_US.UTF-8) happen to treat A-B as AaB, A-b as AaBb, and so
forth (that is, they collate case-insensitively).

>
> $ grep '[a-b]' test.dat
> abcde
> ABCDE

So, in a case-insensitive collation, this range expression includes at
least one of A or B (but probably not both); and since that matches the
ABCDE line, you get a correct result for the collation locale you requested.

>
> Contrast with the correctly-working examples below
>
> $ grep '[ab]' test.dat
> abcde

Here, there's no range, so there's no ambiguity.

Also, try "LC_ALL=C grep '[a-b]' test.dat" to see the difference.
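The LC_ALL=C suggestion can be tried out with a short script. This is only a sketch: it assumes test.dat held just the two lines seen in the quoted output, writes them to a temporary file instead, and adds a [[:lower:]] search that is not from the original report but is a common way to avoid ranges altogether. It sticks to the C locale because that is the only one guaranteed to be installed.

```shell
# Recreate the assumed two-line sample in a temporary file
# (the real test.dat from the report is not available).
tmpfile=$(mktemp)
printf 'abcde\nABCDE\n' > "$tmpfile"

# In the C locale a range follows byte order, so [a-b] means only 'a' and 'b'.
c_range=$(LC_ALL=C grep '[a-b]' "$tmpfile")
echo "C locale, range [a-b]:      $c_range"

# An explicit list contains no range, so collation order plays no part.
explicit=$(LC_ALL=C grep '[ab]' "$tmpfile")
echo "C locale, list [ab]:        $explicit"

# A character class names the set directly instead of relying on a range.
lower=$(LC_ALL=C grep '[[:lower:]]' "$tmpfile")
echo "C locale, class [[:lower:]]: $lower"

rm -f "$tmpfile"
```

Running the first grep again without the LC_ALL=C prefix shows whatever the current locale's collation does with the range, which is where the extra ABCDE line came from in the report.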
-- 
Eric Blake   eblake AT redhat DOT com    +1-801-349-2682
Libvirt virtualization library http://libvirt.org