To: cygwin AT cygwin DOT com
From: Robert Schmidt
Subject: Re: file conversion utility sought: from isolatin (8859-1) to utf8
Date: Tue, 28 Sep 2004 12:57:49 +0200
Message-ID: <415943AD.6030000@broadpark.no>

Ralf Hauser wrote:
> Hi,
>
> Are there any tools like d2u or u2d for UTF-8 for cygwin?
> ...
>
> A starting point might be
> http://userpage.fu-berlin.de/~ram/pub/pub_kfd8tk88g/perl_unicode_en ?

Not particularly cygwin related, but anyway... This is a better start:
http://www.perldoc.com/perl5.8.0/lib/Encode.html

    #!/usr/bin/perl
    # iso2utf8.pl: read ISO-8859-1 on standard input, write UTF-8 on standard output
    use Encode;
    while (<STDIN>) {
        print encode("utf8", decode("iso-8859-1", $_));
    }

Then

    #!/bin/sh
    # Write a UTF-8 copy of each file named on the command line into ./utf8
    # (assumes iso2utf8.pl is executable and on the PATH)
    mkdir -p utf8
    for FILE in "$@" ; do
        iso2utf8.pl < "$FILE" > "utf8/$FILE"
    done

If you're sure you want in-place, finish off with

    mv utf8/* .

If you need to handle a hierarchy of files, you need to fiddle with
find -print0 | xargs -0, or keep it all in perl. I'm not a perl wiz.

Cheers,
Rob
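
For the hierarchy case, a rough sketch of the find -print0 | xargs -0 route might look like the following. It is only an illustration under some assumptions: the function name convert_tree and its SRC/DEST arguments are made up here, iconv stands in for iso2utf8.pl so the snippet runs on its own, and find and xargs are assumed to support -print0 and -0 (the GNU versions do).

```shell
#!/bin/sh
# convert_tree SRC DEST (hypothetical helper): mirror the regular files
# under SRC into DEST, converting each from ISO-8859-1 to UTF-8.
# Assumption: iconv(1) is installed; it replaces iso2utf8.pl in this sketch.
convert_tree() {
    src=$1
    dest=$2
    # List files relative to SRC, NUL-separated so odd file names survive.
    ( cd "$src" && find . -type f -print0 ) |
        xargs -0 sh -c '
            src=$1; dest=$2; shift 2
            for f in "$@"; do
                # Recreate the subdirectory, then write the converted copy.
                mkdir -p "$dest/$(dirname "$f")"
                iconv -f ISO-8859-1 -t UTF-8 < "$src/$f" > "$dest/$f"
            done
        ' sh "$src" "$dest"
}
```

Something like `convert_tree docs docs-utf8` would then leave the originals untouched. DEST should sit outside SRC, otherwise find may pick up the freshly written copies.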