X-Recipient: archive-cygwin@delorie.com
Mailing-List: contact cygwin-help@cygwin.com; run by ezmlm
List-Id: <cygwin.cygwin.com>
List-Subscribe: <mailto:cygwin-subscribe@cygwin.com>
List-Archive: <http://sourceware.org/ml/cygwin/>
List-Post: <mailto:cygwin@cygwin.com>
List-Help: <mailto:cygwin-help@cygwin.com>, <http://sourceware.org/ml/#faqs>
Sender: cygwin-owner@cygwin.com
Mail-Followup-To: cygwin@cygwin.com
Delivered-To: mailing list cygwin@cygwin.com
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-2.0 required=5.0 tests=AWL,BAYES_00,SPF_HELO_PASS,SPF_PASS,T_RP_MATCHES_RCVD autolearn=ham version=3.3.2
X-HELO: mx1.redhat.com
Message-ID: <54A728E8.8030703@redhat.com>
Date: Fri, 02 Jan 2015 16:25:28 -0700
From: Eric Blake <eblake@redhat.com>
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.3.0
MIME-Version: 1.0
To: cygwin@cygwin.com
Subject: Re: grep treating my text files as binary!
References: <XnsA40D81CA1FAA8davidrayninfocouk@80.91.229.13> <549B4258.5050509@redhat.com> <XnsA40DECB2AE256davidrayninfocouk@80.91.229.13> <549C5A6B.2000509@towo.net>
In-Reply-To: <549C5A6B.2000509@towo.net>
OpenPGP: url=http://people.redhat.com/eblake/eblake.gpg
Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="ansFCpfCX1sWHuUKoXbs2QfILiHLRHwO7"
X-IsSubscribed: yes

--ansFCpfCX1sWHuUKoXbs2QfILiHLRHwO7
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable

On 12/25/2014 11:41 AM, Thomas Wolff wrote:
> I've read the POSIX definition of "binary file" that was quoted in the
> grep bug already,
> and if I remember correctly (or how this is abbreviated here...) it does
> not mention character encoding or locale.

Ah, but it does.

http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap03.html#tag_03_397

"Text File
A file that contains characters organized into zero or more lines."

http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap03.html#tag_03_87

"Character
A sequence of one or more bytes representing a single graphic symbol or
control code.

Note:
    This term corresponds to the ISO C standard term multi-byte
character, where a single-byte character is a special case of a
multi-byte character. Unlike the usage in the ISO C standard, character
here has no necessary relationship with storage space, and byte is used
when storage space is discussed.

    See the definition of the portable character set in Portable
Character Set for a further explanation of the graphical representations
of (abstract) characters, as opposed to character encodings."

If you have a file that contains byte sequences that are not characters
in the current locale, then that file is NOT a text file.  It might be a
mostly-text file, and it might even be a text file if you switch to the
correct locale, but the point is that under the POSIX definition of a
text file, character encoding errors in the current locale MAKE a file
binary, at which point behavior is unspecified.
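
To make the locale dependence concrete, here is a minimal Python sketch
of the encoding half of that definition.  It is only an illustration:
the helper name is_text_in_encoding is hypothetical, an explicit
encoding name stands in for "the current locale", and the check ignores
every other aspect of the POSIX definition.  It is not how grep itself
decides.

```python
def is_text_in_encoding(data: bytes, encoding: str) -> bool:
    """Return True if every byte sequence in data decodes as a character.

    Hypothetical helper: 'encoding' stands in for the current locale's
    character encoding.
    """
    try:
        data.decode(encoding, errors="strict")
    except UnicodeDecodeError:
        # At least one byte sequence is not a character in this encoding.
        return False
    return True


# The same bytes, judged under two different encodings.
sample = "café\n".encode("latin-1")  # b'caf\xe9\n'

print(is_text_in_encoding(sample, "utf-8"))    # False: 0xE9 starts an incomplete sequence
print(is_text_in_encoding(sample, "latin-1"))  # True: every byte maps to a character
```

The file contents never change between the two calls; only the encoding
used to interpret them does, which is why the same file can be text in
one locale and binary in another.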

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


--ansFCpfCX1sWHuUKoXbs2QfILiHLRHwO7--
