From: "Nellis, Kenneth"
To: "cygwin AT cygwin DOT com"
Subject: RE: Grepping Unicode files?
Date: Thu, 14 May 2015 16:13:12 +0000

> Does Cygwin’s grep support Unicode files? The output from a SQL Server SQL
> Agent job is a Unicode file, i.e. if you look at it in a hex editor every
> other character is 00 because each character is taking up two bytes. The
> filename itself is fine, it’s the contents that is Unicode. I can’t get
> grep to work on it, either with or without -a.
>
> This may not be a Cygwin-specific question, but I haven’t been able to
> find anything after several Google searches, including the archives, and
> neither --help nor the man page for grep references Unicode.
>
> By default I have neither LC_ALL nor LC_COLLATE set.
>
> A pointer to a better search or a website that explains this would be
> great, or if it can’t currently be done, that’s OK, too.
>
> Thanks for your help!

If you don't have iconv, install the libiconv package. Then, if what you're
searching for is in the ASCII character set, the following should work:

    iconv -f utf16 -t utf8 {your file} | grep {your RE}

--Ken Nellis
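
For anyone who wants to try the pipeline above before pointing it at a real job log, here is a minimal sketch. It assumes the file is UTF-16 little-endian with a byte-order mark, which is what Windows tools usually write when they say "Unicode". The sample file, its two lines, and the pattern 'failed' are made up for illustration, and the encoding names are spelled UTF-16 and UTF-8, which both GNU libiconv and glibc iconv accept.

```shell
#!/bin/sh
# Hypothetical sample: build a small UTF-16LE file that starts with a
# byte-order mark (FF FE), standing in for the SQL Agent output file.
sample=$(mktemp)
{
    printf '\377\376'
    printf 'Job succeeded\nJob failed\n' | iconv -f UTF-8 -t UTF-16LE
} > "$sample"

# Convert to UTF-8 first, then search. With -f UTF-16, iconv reads the
# BOM to pick the byte order and does not pass it through to grep.
matches=$(iconv -f UTF-16 -t UTF-8 "$sample" | grep 'failed')
echo "$matches"

rm -f "$sample"
```

A real file written on Windows will most likely have CRLF line endings, so matched lines may end in a carriage return; piping through `tr -d '\r'` before grep is one way to drop it if that gets in the way.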