Subject: | Re: grep treating my text files as binary! |
From: | Warren Young <wyml AT etr-usa DOT com> |
Date: | Fri, 26 Dec 2014 19:00:38 -0700 |
To: | The Cygwin Mailing List <cygwin AT cygwin DOT com> |
On Dec 25, 2014, at 11:41 AM, Thomas Wolff <towo AT towo DOT net> wrote:

> In any case the argument is quite artificial since the new behaviour
> hits many files that are in fact text files.

Please define the term “text file” in a way that allows a C programmer to write a program that automatically does the correct thing for all members of the class “text file” without involving locales, or an equivalent mechanism.

Just because you, the human, can use your superior pattern matching skills to see that a file is a “text file” doesn’t mean that a relatively dumb program like grep(1) can. You can’t expect someone to build an AI system into grep(1) just to get automatic locale detection.

If grep runs into a byte sequence that is not legal in your current locale, it must treat the file as raw bytes, unless you give it -a.

If you don’t like this behavior, say “alias grep='grep -a'” in your ~/.bashrc, and forget the change ever happened. It’ll be on you when some non-text file gets treated as text and grep spams your terminal with binary garbage, though.

This isn’t really a Cygwin problem. It just happens to affect Cygwin more than other *ix systems because there are two sets of rules on the same system, and they may conflict. But, if I go and copy a UTF-16LE file to a Linux box, I’m not going to complain to the grep bug list when grep doesn’t automatically do the right thing with it while $LANG contains UTF-8.

Ultimately, the proper solution is to use UTF-8 on all systems you use Cygwin on. Many Unicode-aware native Windows programs that deal with text files can cope with UTF-8. If you have one of those that demands UTF-16LE, iconv(1) can make the conversion for you.
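To make the point about locales concrete, here is a rough sketch in Python. It is not grep's actual code, only an illustration of the idea that "text" can only be judged against a declared encoding. The helper names looks_like_text and reencode are made up for this example, and treating any NUL byte as "binary" is a simplifying assumption.

```python
def looks_like_text(data: bytes, encoding: str) -> bool:
    """Hypothetical check: True if data is valid in the given encoding.

    Assumption for this sketch: any NUL byte marks the data as binary,
    and otherwise the data is "text" only if it decodes without error.
    """
    if b"\x00" in data:
        return False
    try:
        data.decode(encoding)
    except UnicodeDecodeError:
        return False
    return True


def reencode(data: bytes, src: str, dst: str = "utf-8") -> bytes:
    """Hypothetical stand-in for an iconv(1)-style conversion."""
    return data.decode(src).encode(dst)


sample = "naïve café\n"
utf8_bytes = sample.encode("utf-8")
utf16_bytes = sample.encode("utf-16-le")
latin1_bytes = sample.encode("latin-1")

# The same human-readable text passes or fails depending on the
# encoding the program was told to expect.
print(looks_like_text(utf8_bytes, "utf-8"))
print(looks_like_text(utf16_bytes, "utf-8"))
print(looks_like_text(latin1_bytes, "utf-8"))
print(looks_like_text(latin1_bytes, "latin-1"))

# Converting to the expected encoding first makes the data acceptable.
print(looks_like_text(reencode(utf16_bytes, "utf-16-le"), "utf-8"))
```

All three byte strings hold the same sentence, yet only the one that matches the expected encoding passes. That is the position grep is in when $LANG says UTF-8 and the file is something else.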