From: Vince Rice <vrice@solidrocksystems.com>
Content-Type: text/plain; charset=utf-8
Subject: Grepping Unicode files?
Message-Id: <3C280897-291A-4A8C-8C3F-46D1D9BEFCFE@solidrocksystems.com>
Date: Thu, 14 May 2015 10:42:50 -0500
To: cygwin@cygwin.com

uname says "CYGWIN_NT-6.1 machinename 1.7.35(0.287/5/3) 2015-03-04 12:07 i686 Cygwin".
I’m running grep 2.21.2, which cygcheck -c says is OK.

Does Cygwin’s grep support Unicode files? The output from a SQL Server SQL Agent job is a Unicode file, i.e. if you look at it in a hex editor, every other byte is 00 because each character takes up two bytes (so presumably UTF-16). The filename itself is fine; it’s the contents that are Unicode. I can’t get grep to match anything in it, either with or without -a.
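
A minimal Python sketch of the symptom described above. It assumes the file is UTF-16 little-endian (an assumption, since the encoding has not been confirmed) and uses a made-up two-line sample rather than the real job output. The helper name utf16_matching_lines is hypothetical. It shows why a plain byte-wise search for an ASCII word finds nothing, and that decoding first makes the search work.

```python
# Hypothetical sample text; the real SQL Agent output is not reproduced here.
sample_text = "Job step failed\nJob step succeeded\n"

# Assumption: the file is UTF-16 little-endian without a byte order mark.
raw = sample_text.encode("utf-16-le")


def utf16_matching_lines(data, needle, encoding="utf-16-le"):
    """Decode the bytes to text first, then return the lines containing needle."""
    text = data.decode(encoding)
    return [line for line in text.splitlines() if needle in line]


# Every ASCII character is followed by a 00 byte, as seen in the hex editor.
print(raw[:8])

# A byte-wise search for the contiguous ASCII word does not match,
# because the 00 bytes sit between its letters.
print(b"failed" in raw)

# Searching the decoded text does match.
print(utf16_matching_lines(raw, "failed"))
```

If the real file starts with a byte order mark, decoding with "utf-16" instead of "utf-16-le" would consume it; which variant applies here is not known.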

This may not be a Cygwin-specific question, but I haven’t been able to find anything after several Google searches and a search of the list archives, and neither --help nor the man page for grep mentions Unicode.

By default I have neither LC_ALL nor LC_COLLATE set.

A pointer to a better search or a website that explains this would be great, or if it can’t currently be done, that’s OK, too.

Thanks for your help!


