Date: Tue, 6 Apr 2021 07:39:42 +0900
Subject: Re: Perl Unidecode modules - which to use (if not Text::Unidecode)?
To: cygwin AT cygwin DOT com
From: Joel Rees via Cygwin
Reply-To: Joel Rees

Well, in the following, are your plans cognizant of the fact that many major
languages do not incorporate a partition between vowels and consonants? Do you
plan to target only those languages which do?

On Tue, 6 Apr 2021 at 6:50, Mark Aitchison wrote:
>
> A little more detail... I realise that stripping accents off is often not
> a good thing to do, but at the moment that basically is what I'm after, or
> to be more specific: I want to know if the character is a consonant or
> vowel... I basically throw away vowels and punctuation in this odd
> application. Later I will want to do all sorts of things with input text
> that might be utf8 or utf16 or some encoding that (hopefully) I can guess
> and translate to the same standard and ultimately spit out on a web page.
>
> There seem to be many perl modules that do similar things... I want to be
> able to distribute my code and not require people to download things from
> cpan. I'd like to stick with modules that are as stock standard as standard
> can be, i.e. are in a standard cygwin distribution, and are normally found
> in other perl environments.
> In a sense, searching cpan gives me too many
> options because that includes modules that might require a customer to do
> more than I should ask them to have to do, if it could have been avoided by
> me choosing a more standard way of achieving the goal in the first place.
>
> What I probably should have asked is...
> 1. What perl module, that comes with cygwin, is good for telling whether a
> letter is a consonant?
> 2. Later on I will also need something that makes a reasonable guess as to
> what kind of encoding is used in some text (that might not have a helpful
> header telling me the answer), with the view to converting it to whatever
> encoding I want? I can find software to do this, but I would like to
> restrict options to just those a cygwin user can install with the setup
> program... if I'm not being too unrealistic about that requirement.
> Thanks, Mark
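
For what it's worth, the "strip accents, then keep only consonants" idea from question 1 can be sketched as follows. This is only an illustration, written in Python rather than Perl, and it is not something proposed in the thread. The function names and the vowel set are assumptions made up for the example. It classifies only letters that reduce to ASCII a-z once combining marks are removed, so (per the caveat above about languages without a vowel/consonant split) everything else is simply treated as "not a consonant". In Perl, the core Unicode::Normalize module's NFD function provides the same decomposition step.

```python
import unicodedata

# Assumption: an English-style vowel set; 'y' is treated as a consonant here.
VOWELS = set("aeiou")


def base_letter(ch: str) -> str:
    """Decompose ch (NFD) and drop combining marks, e.g. 'é' -> 'e'."""
    decomposed = unicodedata.normalize("NFD", ch)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def is_consonant(ch: str) -> bool:
    """True only for characters whose base letter is an ASCII consonant."""
    base = base_letter(ch).lower()
    # Anything that does not reduce to a single a-z letter (digits,
    # punctuation, kana, kanji, 'ø', ...) is reported as not a consonant.
    return len(base) == 1 and "a" <= base <= "z" and base not in VOWELS


def consonant_skeleton(text: str) -> str:
    """Throw away vowels, punctuation and anything unclassifiable."""
    return "".join(ch for ch in text if is_consonant(ch))


if __name__ == "__main__":
    print(consonant_skeleton("Crème brûlée!"))
```

Note that the surviving consonants keep their original form (an "ñ" stays "ñ"); map each character through base_letter as well if plain ASCII output is wanted.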