Mail Archives: cygwin/2021/04/05/18:40:01

Date: Tue, 6 Apr 2021 07:39:42 +0900
Message-ID: <CAAr43iMe7SV12t_95H1GxsZ1iye2v80Mb3=3zi_UtzLcM_j4fA@mail.gmail.com>
Subject: Re: Perl Unidecode modules - which to use (if not Text::Unidecode)?
To: cygwin AT cygwin DOT com
From: Joel Rees via Cygwin <cygwin AT cygwin DOT com>
Reply-To: Joel Rees <joel DOT rees AT gmail DOT com>

Well, regarding the following: do your plans take into account that many
major languages do not draw a partition between vowels and
consonants?

Do you plan to target only those languages which do?
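
If the target really is Latin-script text only, here is a minimal sketch of
one way to do the classification with Unicode::Normalize, which ships with
core Perl. The helper names classify_char and consonants_only are made up for
illustration, and the rule "only a, e, i, o, u are vowels" is an assumption
that already breaks for letters such as "y" or "ø". A kana such as か (ka) is
neither a vowel nor a consonant, so it falls into 'other'.

```perl
use strict;
use warnings;
use Unicode::Normalize qw(NFD);    # core module

# Hypothetical helper: classify one character as 'vowel', 'consonant' or 'other'.
# Assumption: Latin script only, and only a/e/i/o/u count as vowels.
sub classify_char {
    my ($ch) = @_;
    my $base = NFD($ch);           # e.g. U+00E9 becomes "e" + combining acute
    $base =~ s/\p{Mn}+//g;         # drop the combining marks (the accents)
    return 'other' unless $base =~ /\A\p{Latin}\z/ && $base =~ /\A\p{L}\z/;
    return 'vowel' if $base =~ /\A[aeiou]\z/i;
    return 'consonant';            # note: "y" and undecomposable letters land here
}

# Hypothetical helper: throw away everything that is not a consonant.
sub consonants_only {
    my ($text) = @_;
    return join '', grep { classify_char($_) eq 'consonant' } split //, $text;
}

binmode STDOUT, ':encoding(UTF-8)';
print consonants_only("na\x{EF}ve caf\x{E9}"), "\n";    # prints "nvcf"
```

The accented consonants are returned as they came in; only the decision is
made on the accent-free base letter.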

On Tue, 6 Apr 2021 at 6:50, Mark Aitchison <mark DOT aitchison AT cyberxpress DOT co DOT nz> wrote:

>
> A little more detail... I realise that stripping accents is often not
> a good thing to do, but at the moment that is basically what I'm after. To
> be more specific: I want to know whether a character is a consonant or a
> vowel, because I basically throw away vowels and punctuation in this odd
> application. Later I will want to do all sorts of things with input text
> that might be UTF-8 or UTF-16 or some encoding that (hopefully) I can guess
> and translate to the same standard and ultimately spit out on a web page.
>
> There seem to be many Perl modules that do similar things... I want to be
> able to distribute my code and not require people to download things from
> CPAN. I'd like to stick with modules that are as stock standard as standard
> can be, i.e. are in a standard Cygwin distribution, and are normally found
> in other Perl environments. In a sense, searching CPAN gives me too many
> options, because that includes modules that might require a customer to do
> more than I should ask of them, when I could have avoided it by
> choosing a more standard way of achieving the goal in the first place.
>
> What I probably should have asked is...
> 1. What Perl module, that comes with Cygwin, is good for telling whether a
> letter is a consonant?
> 2. Later on I will also need something that makes a reasonable guess as to
> what kind of encoding is used in some text (that might not have a helpful
> header telling me the answer), with a view to converting it to whatever
> encoding I want. I can find software to do this, but I would like to
> restrict options to just those a Cygwin user can install with the setup
> program... if I'm not being too unrealistic about that requirement.
> Thanks, Mark
>
>
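
On question 2 in the quoted text, a minimal sketch using Encode::Guess, which
is part of the Encode distribution in core Perl and so should not need
anything from CPAN. The helper name decode_guessing and the cp1252 fallback
are assumptions for illustration. With its default suspects, Encode::Guess
only recognises ASCII, UTF-8 and BOM-marked UTF-16/32, and it reports
ambiguity if a single-byte encoding is added as a suspect alongside UTF-8, so
this sketch tries the defaults first and otherwise falls back to one assumed
legacy encoding.

```perl
use strict;
use warnings;
use Encode qw(decode encode);
use Encode::Guess;    # exports guess_encoding(); part of core Encode

# Hypothetical helper: return (decoded text, name of the encoding used).
# Assumption: anything that is not ASCII/UTF-8/BOM-marked UTF-16/32 is
# treated as $fallback (cp1252 unless the caller says otherwise).
sub decode_guessing {
    my ($bytes, $fallback) = @_;
    $fallback //= 'cp1252';
    my $guess = guess_encoding($bytes);    # object on success, error string on failure
    return ($guess->decode($bytes), $guess->name) if ref $guess;
    return (decode($fallback, $bytes), $fallback);
}

# Usage: normalise unknown input to UTF-8 bytes for a web page.
my ($text, $name) = decode_guessing("caf\xE9 au lait");
my $utf8_bytes = encode('UTF-8', $text);
print "decoded as $name\n";
```

A guess can only ever be a guess: short inputs in particular can be valid in
several encodings at once, so a header or other out-of-band hint should win
when one is available.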