To: cygwin AT cygwin DOT com
From: Mark Aitchison
Subject: Perl Unidecode modules - which to use (if not Text::Unidecode)?
Date: Fri, 2 Apr 2021 09:35:31 +1300

I am writing Perl programs that I'd like to know will work under both Linux and Cygwin, and they now have to deal with Unicode. I had used Text::Unidecode happily on Linux but can find no Cygwin version.
Possibly I am not looking in the right places for it. But possibly there are other Unicode-related modules, well supported under both Cygwin and Linux, that I should be using instead. I guess Unicode might be one of those things that depends on the underlying OS, so it probably pays to go with whatever the standard set of modules is.

1. What Perl Unicode modules should I consider, if not Text::Unidecode? The present need is to convert those few "foreign" characters (like ÇĆĈĊçĉċĜĞĠĢĝģğġËÌÍÎÏÒÓÔÕ) that are basically ASCII letters with accent marks to their closest ASCII equivalents. I'd like to do more with Unicode in the future, though, without going down any dead ends as far as running under Cygwin is concerned.

2. I see some talk of internationalization in Chapter 2 of "Setting up Cygwin", but nothing relating to Perl modules, and I don't see an easy way to search many months of the mailing list for a keyword. Is there any information I should know about?

Thanks,
Mark Aitchison
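
P.S. To make the present need in question 1 concrete, here is a minimal sketch of the kind of conversion I mean. It is not Text::Unidecode; it assumes the core Unicode::Normalize module is present in Cygwin's perl (it has shipped with Perl since 5.8, but I have not confirmed the Cygwin packaging), and strip_accents is just a hypothetical helper name. It only handles letters that decompose into a base letter plus combining marks, so characters such as Ø or Ł, which have no such decomposition, pass through unchanged.

```perl
use strict;
use warnings;
use utf8;                          # this source file contains UTF-8 literals
use Unicode::Normalize qw(NFD);    # core module; assumed available on Cygwin too

# Hypothetical helper: decompose each character into base letter + combining
# marks (NFD), then delete the combining marks, leaving the base letters.
sub strip_accents {
    my ($text) = @_;
    my $decomposed = NFD($text);
    $decomposed =~ s/\p{Mn}//g;    # \p{Mn} = nonspacing marks (the accents)
    return $decomposed;
}

binmode(STDOUT, ':encoding(UTF-8)');
print strip_accents("ÇĆĈĊçĉċĜĞĠĢĝģğġËÌÍÎÏÒÓÔÕ"), "\n";
# prints: CCCCcccGGGGggggEIIIIOOOO
```

Text::Unidecode goes further than this (it transliterates characters that are not simply accented ASCII), which is why I would still like to know what the supported equivalent is.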