Message-ID: <439ae1d0-3b38-4553-b889-f0b344dfaf4a@towo.net>
Date: Thu, 24 Oct 2024 19:22:18 +0200
Subject: Re: Is this correct behaviour for 'rev'?
To: cygwin AT cygwin DOT com
References: <6fdbf92d-51f2-47ae-a482-5edd89ed3a89 AT maxrnd DOT com> <429b4a7c-4a05-467c-a90d-6ed6e87cfc63 AT SystematicSW DOT ab DOT ca>
In-Reply-To: <429b4a7c-4a05-467c-a90d-6ed6e87cfc63@SystematicSW.ab.ca>
List-Id: General Cygwin discussions and problem reports
From: Thomas Wolff via Cygwin
Reply-To: Thomas Wolff

On 24.10.2024 at 15:56, Brian Inglis via Cygwin wrote:
> On 2024-10-24 02:37, Thomas Wolff via Cygwin wrote:
>>
>> On 24.10.2024 at 07:01, Mark Geisert via Cygwin wrote:
>>> Replying to myself, I continue...
>>>
>>> On 10/22/2024 10:33 PM, Mark Geisert via Cygwin wrote:
>>>> On 10/22/2024 8:00 PM, Backwoods BC via Cygwin wrote:
>>>>> It appears that 'rev' is choking on any character \x80 or higher, but
>>>>> is OK with those \x1f or smaller. It doesn't give an error or ignore
>>>>> it, it just stops.
>>>>>
>>>>> I don't have access to a Linux box so I can't see if this happens
>>>>> there and nothing in the documentation suggests that this is the
>>>>> correct functionality.
>>>>>
>>>>> Test case:
>>>>> printf 'no non-ASCII characters\nhex 01 >\x01< here\nhex 80 >\x80< here\nLine 4\n'|rev|rev
>>>>>
>>>>> This is for "rev from util-linux 2.33.1"
>>>>>
>>>>> I don't have the current version of 'rev' on my system due to not
>>>>> having updated in a while. I accidentally screwed up my installation
>>>>> and have been reluctant to wipe it and start over.
>>>>>
>>>>> So, is this the expected behaviour for the current version of 'rev'
>>>>> under Cygwin and/or Linux?
>>>>
>>>> The current Cygwin util-linux 2.39.3-2 rev behaves in the same,
>>>> broken way.  It looks like line-ending char(s) are not being handled
>>>> correctly.
>>>> Don't know yet if it's rev itself or fgetws() being used
>>>> by rev that's busted.  I'll investigate further.  Thanks for the
>>>> report!
>>>
>>> This is a locale issue.  In the default Cygwin locale, rev mishandles
>>> the \x80 byte and instead of stopping with an error message it enters
>>> an infinite loop.  I'll probably report this upstream instead of
>>> working out a local fix.
>>>
>>> There is a work-around: change to the "C" locale just to run rev.
>>>     LC_ALL=C rev zzz
>>> where zzz is a file containing your four lines.  You can also run your
>>> original testcase with "rev" replaced by "LC_ALL=C rev" in both places.
>> Sorry, this is not a good workaround as it corrupts all (proper)
>> non-ASCII characters.
>> You could do e.g.
>> grep . | rev
>
> Not quite, as that just matches non-empty lines, you would have to do
> something more like `grep -o . ...`, but not sure that would do what
> you want either.
>
Ah, right, so:
egrep -e "(^$|.)" | rev
or maybe there is some more suitable tool.

> The correct approach should be to match the execution locale to the
> file locale, for example, `LC_ALL=...UTF-8 rev ...` which should
> produce the expected results.
That's not the point. You can never be sure that your files contain no
stray, wrongly encoded byte, and rev should definitely not loop
endlessly in that case.
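
To illustrate that last point, here is a minimal Python sketch of a line reverser that tolerates stray bytes. It is not how util-linux rev is implemented, and the names reverse_line and rev_bytes are made up for this example. It assumes the input is meant to be UTF-8 and that lines end in a plain "\n". Python's "surrogateescape" error handler turns each undecodable byte into a placeholder code point and turns it back into the same byte on output, so proper multi-byte characters stay intact and a lone \x80 passes through instead of stalling the tool.

```python
def reverse_line(raw: bytes) -> bytes:
    """Reverse the characters of one line (given without its newline)."""
    # surrogateescape never raises on invalid UTF-8: each bad byte becomes
    # a lone surrogate code point (U+DC80..U+DCFF) in the decoded string.
    text = raw.decode("utf-8", errors="surrogateescape")
    # Reverse by code point, then map the placeholders back to raw bytes.
    return text[::-1].encode("utf-8", errors="surrogateescape")


def rev_bytes(data: bytes) -> bytes:
    """Reverse every "\\n"-separated line of data (hypothetical helper)."""
    # Splitting on b"\n" keeps a trailing newline as an empty last field,
    # so the newline layout of the input is preserved on output.
    return b"\n".join(reverse_line(line) for line in data.split(b"\n"))
```

Feeding the four-line test case from the original report through rev_bytes twice should give back the original bytes. That round trip is not guaranteed for arbitrary input, though: two adjacent stray bytes can happen to form a valid UTF-8 sequence once reversed.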