From: Corinna Vinschen via Cygwin
Reply-To: cygwin AT cygwin DOT com
To: cygwin AT cygwin DOT com
Cc: Corinna Vinschen, Brian Inglis
Date: Wed, 15 Feb 2023 14:52:23 +0100
Subject: Re: [ANNOUNCEMENT] Updated: dash 0.5.12-2
List-Id: General Cygwin discussions and problem reports
References: <6810586169 DOT 20230213204858 AT yandex DOT ru> <8a583e14-b413-d1a2-35d9-e76f73a4b338 AT Shaw DOT ca>

Hi Brian,

On Feb 13 20:37, Corinna Vinschen via Cygwin wrote:
> On Feb 13 12:03, Brian Inglis via Cygwin wrote:
> > On 2023-02-13 10:43, ASSI via Cygwin wrote:
> > > Corinna Vinschen via Cygwin writes:
> > > > Can you give me an example?  I'm a bit puzzled because fnmatch as well
> > > > as glob in Cygwin support native characters.
> >
> > But not locale dependent named character classes like regexp in paths.
>
> I checked the dash code of current dash git, and while its internal glob
> implementation supports character classes, they are not localized, using
> standard singlebyte functions isalnum, isalpha, etc. under the hood.
>
> So, yeah, what you say further down this mail... looks like dash
> supports locale dependent character classes only with glibc.
> [...]
> Either way, I don't care much for what a certain application provides by
> itself.  I'm talking about our libc, that is Cygwin, and what it
> provides to processes calling its implementations of regcomp/regexec,
> glob and fnmatch.
>
> All these functions have been taken from FreeBSD and all three suffer
> shortcomings:
>
> - regcomp/regexec supports POSIX named character classes, collating
>   symbols, and equivalence class expressions, but all of them only work
>   for ASCII chars.
>
> - fnmatch and glob support none of named character classes,
>   collating symbols, and equivalence class expressions.
>
> I checked the upstream code in FreeBSD, OpenBSD and NetBSD and none of
> these functions are improved to support locales (regcomp) or any of
> the character classes stuff (fnmatch/glob).
>
> So, if we want to add this support to Cygwin (and thus, to all
> applications calling the libc implementation of these functions),
> quite a bit of work is required.
>
> Being able to fetch the implementation from some other source
> would reduce the effort enormously :}

I took the liberty of adding [:class:] support to Cygwin's fnmatch(3) and
glob(3) functions.  They also recognize collating symbols [.symbol.] and
equivalence class expressions [=class=], but the latter two are not
implemented yet, and fnmatch/glob simply skip them in the pattern.

Given that glob and fnmatch use wide characters internally, the support
for character classes is internationalized by default, albeit in a
slightly different way than in glibc.  The classes a Unicode character
belongs to are not locale dependent in Cygwin/newlib.  All characters
have their classes assigned all the time, so, for instance, the German
character 'ä' is lower and alpha even in the en_US.utf8 locale.

The currently building Cygwin test release 3.5.0-0.174.gd6d4436145b8
contains the new code.  Would you mind building a dash for testing so we
can see if and how it works?


Thanks,
Corinna

-- 
Problem reports:      https://cygwin.com/problems.html
FAQ:                  https://cygwin.com/faq/
Documentation:        https://cygwin.com/docs.html
Unsubscribe info:     https://cygwin.com/ml/#unsubscribe-simple