Mail Archives: cygwin/2015/05/14/12:14:39

From: "Nellis, Kenneth" <Kenneth DOT Nellis AT xerox DOT com>
To: "cygwin AT cygwin DOT com" <cygwin AT cygwin DOT com>
Subject: RE: Grepping Unicode files?
Date: Thu, 14 May 2015 16:13:12 +0000
Message-ID: <0D835E9B9CD07F40A48423F80D3B5A702E8DCF27@USA7109MB022.na.xerox.net>
References: <3C280897-291A-4A8C-8C3F-46D1D9BEFCFE AT solidrocksystems DOT com>
In-Reply-To: <3C280897-291A-4A8C-8C3F-46D1D9BEFCFE@solidrocksystems.com>

> Does Cygwin’s grep support Unicode files? The output from a SQL Server SQL
> Agent job is a Unicode file, i.e. if you look at it in a hex editor every
> other character is 00 because each character is taking up two bytes. The
> filename itself is fine, it’s the contents that is Unicode. I can’t get
> grep to work on it, either with or without -a.
> 
> This may not be a Cygwin-specific question, but I haven’t been able to
> find anything after several Google searches, including the archives, and
> neither --help nor the man page for grep references Unicode.
> 
> By default I have neither LC_ALL nor LC_COLLATE set.
> 
> A pointer to a better search or a website that explains this would be
> great, or if it can’t currently be done, that’s OK, too.
> 
> Thanks for your help!

If you don't have iconv, install the libiconv package.

Then, if what you're searching for is in the ASCII character set,
the following should work:

iconv -f utf16 -t utf8 {your file} | grep {your RE}
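As a rough, self-contained illustration of that pipeline: the sketch below builds a small UTF-16 sample file and then converts it back to UTF-8 before grepping. The temporary file, its contents, and the pattern "error" are all made up for the example; substitute your own job output file and RE. It spells the encodings as UTF-16 and UTF-8 and assumes iconv writes a byte-order mark when encoding to UTF-16 and honors it when decoding.

```shell
# Hypothetical sample standing in for the SQL Agent job output.
sample=$(mktemp)

# Encode two ASCII lines as UTF-16; each character now occupies two bytes.
printf 'step 1 ok\nstep 2 error\n' | iconv -f UTF-8 -t UTF-16 > "$sample"

# Convert back to UTF-8 on the fly, then grep the converted stream.
matches=$(iconv -f UTF-16 -t UTF-8 "$sample" | grep 'error')
echo "$matches"

rm -f "$sample"
```

Because the conversion happens in the pipe, the original file is left untouched.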

--Ken Nellis
