Mail Archives: cygwin/2024/10/24/13:23:31

Message-ID: <439ae1d0-3b38-4553-b889-f0b344dfaf4a@towo.net>
Date: Thu, 24 Oct 2024 19:22:18 +0200
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: Is this correct behaviour for 'rev'?
To: cygwin AT cygwin DOT com
References: <CAKwdsS9FGm9nqtZ+vSQ+WEWzRf-zUFAS06eo=ASwNB6ST3gddw AT mail DOT gmail DOT com>
<6fdbf92d-51f2-47ae-a482-5edd89ed3a89 AT maxrnd DOT com>
<f58d4a6c-476d-4cc5-bad5-28c99ad75c2b AT maxrnd DOT com>
<f6930a61-eed4-4a06-a813-6e0ea1914a13 AT towo DOT net>
<429b4a7c-4a05-467c-a90d-6ed6e87cfc63 AT SystematicSW DOT ab DOT ca>
In-Reply-To: <429b4a7c-4a05-467c-a90d-6ed6e87cfc63@SystematicSW.ab.ca>
List-Id: General Cygwin discussions and problem reports <cygwin.cygwin.com>
List-Unsubscribe: <https://cygwin.com/mailman/options/cygwin>,
<mailto:cygwin-request AT cygwin DOT com?subject=unsubscribe>
List-Archive: <https://cygwin.com/pipermail/cygwin/>
List-Post: <mailto:cygwin AT cygwin DOT com>
List-Help: <mailto:cygwin-request AT cygwin DOT com?subject=help>
List-Subscribe: <https://cygwin.com/mailman/listinfo/cygwin>,
<mailto:cygwin-request AT cygwin DOT com?subject=subscribe>
From: Thomas Wolff via Cygwin <cygwin AT cygwin DOT com>
Reply-To: Thomas Wolff <towo AT towo DOT net>



On 24.10.2024 at 15:56, Brian Inglis via Cygwin wrote:
> On 2024-10-24 02:37, Thomas Wolff via Cygwin wrote:
>>
>> On 24.10.2024 at 07:01, Mark Geisert via Cygwin wrote:
>>> Replying to myself, I continue...
>>>
>>> On 10/22/2024 10:33 PM, Mark Geisert via Cygwin wrote:
>>>> On 10/22/2024 8:00 PM, Backwoods BC via Cygwin wrote:
>>>>> It appears that 'rev' is choking on any character \x80 or higher, but
>>>>> is OK with those \x1f or smaller. It doesn't give an error or ignore
>>>>> it, it just stops.
>>>>>
>>>>> I don't have access to a Linux box so I can't see if this happens
>>>>> there and nothing in the documentation suggests that this is the
>>>>> correct functionality.
>>>>>
>>>>> Test case:
>>>>> printf 'no non-ASCII characters\nhex 01 >\x01< here\nhex 80 >\x80< here\nLine 4\n'|rev|rev
>>>>>
>>>>> This is for "rev from util-linux 2.33.1"
>>>>>
>>>>> I don't have the current version of 'rev' on my system due to not
>>>>> having updated in a while. I accidentally screwed up my installation
>>>>> and have been reluctant to wipe it and start over.
>>>>>
>>>>> So, is this the expected behaviour for the current version of 'rev'
>>>>> under Cygwin and/or Linux?
>>>>
>>>> The current Cygwin util-linux 2.39.3-2 rev behaves in the same,
>>>> broken way.  It looks like line-ending char(s) are not being handled
>>>> correctly.   Don't know yet if it's rev itself or fgetws() being used
>>>> by rev that's busted.  I'll investigate further.  Thanks for the
>>>> report!
>>>
>>> This is a locale issue.  In the default Cygwin locale, rev mishandles
>>> the \x80 byte and instead of stopping with an error message it enters
>>> an infinite loop.  I'll probably report this upstream instead of
>>> working out a local fix.
>>>
>>> There is a work-around: change to the "C" locale just to run rev.
>>>     LC_ALL=C rev zzz
>>> where zzz is a file containing your four lines.  You can also run your
>>> original testcase with "rev" replaced by "LC_ALL=C rev" in both places.
>> Sorry, this is not a good workaround as it corrupts all (proper)
>> non-ASCII characters.
>> You could do e.g.
>> grep . | rev
>
> Not quite, as that just matches non-empty lines, you would have to do
> something more like `grep -o . ...`, but not sure that would do what
> you want either.
>
Ah, right, so:
egrep -e "(^$|.)" | rev
or maybe there is some more suitable tool.
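
One candidate is iconv. The following is only a minimal sketch, assuming
iconv is installed and the data is meant to be UTF-8; `sanitize_utf8` is a
hypothetical helper name, not an existing command. Unlike the grep filter,
it removes the offending bytes rather than whole lines, so the output
differs from the input wherever something was dropped.

```shell
# Hypothetical helper: copy stdin to stdout, omitting byte sequences
# that are not valid UTF-8.
sanitize_utf8() {
    # -c asks iconv to leave out characters it cannot convert.
    # Converting UTF-8 to UTF-8 keeps valid text unchanged.
    # Some iconv implementations exit non-zero when they had to drop
    # something, so the exit status is deliberately ignored here.
    iconv -c -f UTF-8 -t UTF-8 || true
}
```

It would be used in front of rev, for example
`printf 'hex 80 >\200< here\nLine 4\n' | sanitize_utf8 | rev`.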

> The correct approach should be to match the execution locale to the
> file locale, for example, `LC_ALL=...UTF-8 rev ...` which should
> produce the expected results.
That's not the point. You can never be sure that your files contain no
stray, wrongly encoded byte, and rev should definitely not loop endlessly
in that case.
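
Until rev itself is fixed, a script can guard against the hang. This is
a sketch under assumptions, not a tested fix: it assumes iconv, mktemp and
the coreutils `timeout` command are available, that the input is meant to
be UTF-8, and that rev runs in a UTF-8 locale. `safe_rev`, the 10-second
limit and the exit status 65 are arbitrary illustrative choices.

```shell
# Hypothetical helper: reverse lines from stdin only if the input is
# valid UTF-8, and never let rev run longer than 10 seconds.
safe_rev() {
    tmp=$(mktemp) || return 1
    cat > "$tmp"    # buffer stdin, because it is read twice
    # Without -c, iconv stops with a non-zero status at the first
    # invalid byte sequence, which makes it usable as a validity check.
    if iconv -f UTF-8 -t UTF-8 "$tmp" > /dev/null 2>&1; then
        # timeout is a second line of defence in case rev still hangs.
        timeout 10 rev < "$tmp"
        status=$?
    else
        echo "safe_rev: input is not valid UTF-8, not running rev" >&2
        status=65
    fi
    rm -f "$tmp"
    return "$status"
}
```

With the test case from this thread, `safe_rev` would refuse the input
because of the \x80 byte instead of handing it to rev.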


