Mail Archives: cygwin/2014/12/27/05:07:55

From: Bengt Larsson <lists DOT cygwin4 AT bengtl DOT net>
To: cygwin AT cygwin DOT com
Subject: Re: grep treating my text files as binary!
Date: Sat, 27 Dec 2014 11:07:27 +0100
Reply-To: cygwin AT cygwin DOT com

Warren Young wrote:
>On Dec 25, 2014, at 11:41 AM, Thomas Wolff <towo AT towo DOT net> wrote:
>
>> In any case the argument is quite artificial since the new behaviour
>> hits many files that are in fact text files.
>
>Please define the term “text file” in a way that allows a C programmer
>to write a program that automatically does the correct thing for all
>members of the class “text file” without involving locales, or an
>equivalent mechanism.
...
>If grep runs into a byte sequence that makes it think it is not legal
>for your current locale, it must treat the file as raw bytes, unless you
>give it -a.
>
>If you don’t like this behavior, say “alias grep='grep -a'” in your
>~/.bashrc, and forget the change ever happened.  It’ll be on you when
>some non-text file gets treated as text and grep spams your terminal
>with binary garbage, though.

It's better to use the "alias grep='LC_ALL=C grep'" method. It keeps the
old way of detecting binaries (for example, it still detects an .EXE as
binary) while letting you match mostly-ASCII files that contain a few
characters that are not valid in your locale. The definition you ask for
is already in the code. For those of us who are not English speakers,
detecting what is "mostly ASCII" is mostly right, at least
interactively.
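
As a rough sketch of the difference (assuming GNU grep; the file names
and contents below are made up for illustration, and in an interactive
shell the alias above would take the place of the explicit LC_ALL=C):

```shell
# Sketch only: assumes GNU grep; file names and contents are hypothetical.
tmp=$(mktemp -d)

# A mostly-ASCII line holding one CP1252 byte (octal 351 = 0xE9, "é"),
# which is not a valid sequence in a UTF-8 locale.
printf 'C:/Users/Ren\351/Documents\n' > "$tmp/dirs.txt"

# A file with NUL bytes, standing in for an .EXE.
printf 'MZ\000\000Documents\n' > "$tmp/fake.exe"

# In the C locale every byte counts as a character, so this matches as text.
text_hits=$(LC_ALL=C grep -c 'Documents' "$tmp/dirs.txt")

# The NUL bytes still trip the binary check, so grep reports
# a binary match instead of printing the line.
bin_report=$(LC_ALL=C grep 'Documents' "$tmp/fake.exe" 2>&1)

echo "text hits: $text_hits"
echo "binary report: $bin_report"
rm -rf "$tmp"
```

The exact wording of the binary-match message differs between grep
versions, so scripts should not depend on it.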

I ran into this, actually. I keep a list of my directories and it is in
CP1252 for reasons of interfacing with CMD.EXE. Suddenly grep couldn't
match it. But I figured something was up and set my locale to CP1252 and
then it worked.


