Mail Archives: cygwin/2021/04/04/17:26:39

References: <d3342ff4-f717-f882-5c41-b27ab272dc03 AT cyberXpress DOT co DOT nz>
In-Reply-To: <d3342ff4-f717-f882-5c41-b27ab272dc03@cyberXpress.co.nz>
Date: Mon, 5 Apr 2021 06:26:18 +0900
Message-ID: <CAAr43iOdVea3YYThgdYpJxRCaVtFVhyHz_FwMTQhqTw8+YT-zg@mail.gmail.com>
Subject: Re: Perl Unidecode modules - which to use (if not Text::Unidecode)?
To: cygwin AT cygwin DOT com
List-Id: General Cygwin discussions and problem reports <cygwin.cygwin.com>
List-Archive: <https://cygwin.com/pipermail/cygwin/>
From: Joel Rees via Cygwin <cygwin AT cygwin DOT com>
Reply-To: Joel Rees <joel DOT rees AT gmail DOT com>

Erk.

Sorry for the feint, Mark.

CPAN is the Perl way to get Perl modules and such, but see below.

On Fri, 2 Apr 2021 at 5:38, Mark Aitchison <M DOT Aitchison AT cyberxpress DOT co DOT nz> wrote:

> I am writing perl programs that I'd like to know will work under both
> Linux and Cygwin, and have to deal with Unicode now.
>
> I had used Text::Unidecode happily in Linux but find no cygwin version.
> Possibly I am not looking in the right places for it, but possibly there
> are different Unicode-related modules that are well-supported under both
> cygwin and linux that I should be using instead, and I guess Unicode
> might be one of those things where it depends on the underlying o/s so it
> probably pays to go with whatever is the standard set of modules.
>
> 1. What perl Unicode modules should I consider, if not Text::Unidecode?
> The present need is to be able to convert those few "foreign" characters
> (like ÇĆĈĊçĉċĜĞĠĢĝģğġËÌÍÎÏÒÓÔÕ) that are basically ASCII with accent
> marks to their closest ASCII equivalents, but I'd like to do more with
> Unicode in the future, without going down any dead-ends as far as being
> able to run under cygwin is concerned.
>

"Stripping those few foreign accent characters" is probably not really what
you want to do.

Those "accent characters" are probably characters from some foreign
encoding (likely not Unicode) that have been misinterpreted. Simply
"stripping" them will basically convert them to truly meaningless junk. I
suppose the reader can then interpret the meaningless junk as "there used
to be a foreign word here", but why bother contributing further to
information entropy?
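
If the input does turn out to be correctly encoded text, and dropping the
marks is still what is wanted, something along these lines might serve as
a starting point. It is only a sketch: it assumes the input bytes are
UTF-8, it uses the Encode and Unicode::Normalize modules that ship with
perl itself (so nothing extra to install on either Linux or Cygwin), and
strip_marks is a made-up name, not something from any module.

```perl
use strict;
use warnings;
use Encode qw(decode);
use Unicode::Normalize qw(NFD);

# Hypothetical helper: decode the bytes, decompose each character into
# base letter + combining marks, then drop the marks.
sub strip_marks {
    my ($bytes) = @_;
    my $text = decode('UTF-8', $bytes);  # assumption: input is UTF-8 bytes
    my $nfd  = NFD($text);               # e.g. U+00C7 -> "C" + U+0327
    $nfd =~ s/\p{Mn}//g;                 # remove nonspacing marks
    return $nfd;
}

# The UTF-8 bytes for U+00C7, U+011C and U+00CB (C-cedilla,
# G-circumflex, E-diaeresis).
print strip_marks("\xC3\x87\xC4\x9C\xC3\x8B"), "\n";   # prints "CGE"
```

Note that this only handles letters that decompose into a base letter plus
marks. Characters such as ø or ß have no such decomposition and pass
through unchanged, which is the kind of gap Text::Unidecode fills with its
transliteration tables.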

> 2. I see some talk of Internationalization in Chapter 2 of "Setting up
> Cygwin", but cannot see anything relating to perl modules, and I don't
> see any easy way to search many months of the mailing list for a
> keyword... is there any information I should know about?


Have you read the perldoc pages on internationalization, such as
perlunicode, perluniintro, and perllocale?