DKIM-Filter: OpenDKIM Filter v2.11.0 delorie.com 49OHNUO94176132
Authentication-Results: delorie.com;
	dkim=pass (1024-bit key, unprotected) header.d=cygwin.com header.i=@cygwin.com header.a=rsa-sha256 header.s=default header.b=Pj20WReT
X-Recipient: archive-cygwin@delorie.com
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org E21353858D28
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cygwin.com;
	s=default; t=1729790608;
	bh=tNO+SAm4qtvw0aHAvel6l/OqP9UZdXL1IKZFIpRx2bw=;
	h=Date:Subject:To:References:In-Reply-To:List-Id:List-Unsubscribe:
	 List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To:
	 From;
	b=Pj20WReTQoKs928ZzqDZpWcCMYuNSNmdlKoGu26lsXmSkdBwJlMX776fxMPcNsun/
	 9cSmaxxt4HLwG2YPmbKUYConsSHew+a2jA6lvsJggJJJLy/n/QZRK7uLjcUAo4hac8
	 7saEsSW3wjlkXfjO6KqupcfaYhvxtlSlwYuUn+iY=
X-Original-To: cygwin@cygwin.com
Delivered-To: cygwin@cygwin.com
DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org DA0E03858405
ARC-Filter: OpenARC Filter v1.0.0 sourceware.org DA0E03858405
ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1729790544; cv=none;
 b=YcBUPNTF5W+1/zBt9cy1fA4qcc1wPCJJDiQ9/iEn9p3B/kt38NIqyUaBap1rSEa/fDbWFIWuGxTwgZhWMu2O6k5f9pe6yiZxOt1aghyXJRXb+49E4y8wHpEkNKETTuFY+79GQbqpIrwTLhYbGE+IqCYxKMBFJ9+iFSZmC4rXmPE=
ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key;
 t=1729790544; c=relaxed/simple;
 bh=N44dAVWHMi3tN7ORE3HZbqetlUOIRDoR+ahBP9aff6o=;
 h=DKIM-Signature:Message-ID:Date:MIME-Version:Subject:To:From;
 b=mGDK3ZEsrCnTl+OMJd32jvB2PJd0UP2kIMBRku5OhYHd7Huvyo/fLu2t7CJZzoPWAbbdBDNZYQXpMQWa3MGryqMxbYau8IVuNIS7n6FZrTm+1QuXAVkjiJ+cL7wsL8besiQyRfjwn0C8ZXHzO1v9J8a56iOianP4q50B/YX4UzQ=
ARC-Authentication-Results: i=1; server2.sourceware.org
X-UI-Sender-Class: 55c96926-9e95-11ee-ae09-1f7a4046a0f6
Message-ID: <439ae1d0-3b38-4553-b889-f0b344dfaf4a@towo.net>
Date: Thu, 24 Oct 2024 19:22:18 +0200
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: Is this correct behaviour for 'rev'?
To: cygwin@cygwin.com
References: <CAKwdsS9FGm9nqtZ+vSQ+WEWzRf-zUFAS06eo=ASwNB6ST3gddw@mail.gmail.com>
 <6fdbf92d-51f2-47ae-a482-5edd89ed3a89@maxrnd.com>
 <f58d4a6c-476d-4cc5-bad5-28c99ad75c2b@maxrnd.com>
 <f6930a61-eed4-4a06-a813-6e0ea1914a13@towo.net>
 <429b4a7c-4a05-467c-a90d-6ed6e87cfc63@SystematicSW.ab.ca>
In-Reply-To: <429b4a7c-4a05-467c-a90d-6ed6e87cfc63@SystematicSW.ab.ca>
X-Spam-Status: No, score=-1.8 required=5.0 tests=BAYES_00, DKIM_SIGNED,
 DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, RCVD_IN_DNSWL_NONE,
 RCVD_IN_MSPIKE_H4, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_PASS,
 TXREP autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on
 server2.sourceware.org
X-BeenThere: cygwin@cygwin.com
X-Mailman-Version: 2.1.30
Precedence: list
List-Id: General Cygwin discussions and problem reports <cygwin.cygwin.com>
List-Unsubscribe: <https://cygwin.com/mailman/options/cygwin>,
 <mailto:cygwin-request@cygwin.com?subject=unsubscribe>
List-Archive: <https://cygwin.com/pipermail/cygwin/>
List-Post: <mailto:cygwin@cygwin.com>
List-Help: <mailto:cygwin-request@cygwin.com?subject=help>
List-Subscribe: <https://cygwin.com/mailman/listinfo/cygwin>,
 <mailto:cygwin-request@cygwin.com?subject=subscribe>
From: Thomas Wolff via Cygwin <cygwin@cygwin.com>
Reply-To: Thomas Wolff <towo@towo.net>
Content-Type: text/plain; charset="utf-8"; Format="flowed"
Errors-To: cygwin-bounces~archive-cygwin=delorie.com@cygwin.com
Sender: "Cygwin" <cygwin-bounces~archive-cygwin=delorie.com@cygwin.com>
Content-Transfer-Encoding: 8bit
X-MIME-Autoconverted: from base64 to 8bit by delorie.com id 49OHNUO94176132




On 24.10.2024 at 15:56, Brian Inglis via Cygwin wrote:
> On 2024-10-24 02:37, Thomas Wolff via Cygwin wrote:
>>
>> On 24.10.2024 at 07:01, Mark Geisert via Cygwin wrote:
>>> Replying to myself, I continue...
>>>
>>> On 10/22/2024 10:33 PM, Mark Geisert via Cygwin wrote:
>>>> On 10/22/2024 8:00 PM, Backwoods BC via Cygwin wrote:
>>>>> It appears that 'rev' is choking on any character \x80 or higher, but
>>>>> is OK with those \x1f or smaller. It doesn't give an error or ignore
>>>>> it, it just stops.
>>>>>
>>>>> I don't have access to a Linux box so I can't see if this happens
>>>>> there and nothing in the documentation suggests that this is the
>>>>> correct functionality.
>>>>>
>>>>> Test case:
>>>>> printf 'no non-ASCII characters\nhex 01 >\x01< here\nhex 80 >\x80<
>>>>> here\nLine 4\n'|rev|rev
>>>>>
>>>>> This is for "rev from util-linux 2.33.1"
>>>>>
>>>>> I don't have the current version of 'rev' on my system due to not
>>>>> having updated in a while. I accidentally screwed up my installation
>>>>> and have been reluctant to wipe it and start over.
>>>>>
>>>>> So, is this the expected behaviour for the current version of 'rev'
>>>>> under Cygwin and/or Linux?
>>>>
>>>> The current Cygwin util-linux 2.39.3-2 rev behaves in the same,
>>>> broken way.  It looks like line-ending char(s) are not being handled
>>>> correctly.   Don't know yet if it's rev itself or fgetws() being used
>>>> by rev that's busted.  I'll investigate further.  Thanks for the
>>>> report!
>>>
>>> This is a locale issue.  In the default Cygwin locale, rev mishandles
>>> the \x80 byte and instead of stopping with an error message it enters
>>> an infinite loop.  I'll probably report this upstream instead of
>>> working out a local fix.
>>>
>>> There is a work-around: change to the "C" locale just to run rev.
>>>     LC_ALL=C rev zzz
>>> where zzz is a file containing your four lines.  You can also run your
>>> original testcase with "rev" replaced by "LC_ALL=C rev" in both places.
>> Sorry, this is not a good workaround as it corrupts all (proper)
>> non-ASCII characters.
>> You could do e.g.
>> grep . | rev
>
> Not quite, as that just matches non-empty lines, you would have to do
> something more like `grep -o . ...`, but not sure that would do what
> you want either.
>
Ah, right, so:
grep -E -e "(^$|.)" | rev
or maybe there is a more suitable tool.
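
To illustrate the objection to the LC_ALL=C workaround quoted above, here
is a minimal Python sketch. It is not rev's code; it only assumes that a
single-byte locale amounts to reversing each line byte by byte, and
`is_valid_utf8` is a helper made up for this example. It shows that a
byte-wise reversal tears a two-byte UTF-8 character apart, while a
character-wise reversal keeps it intact.

```python
# Sketch only: models a byte-wise line reversal, which is what a
# single-byte ("C") locale is ASSUMED to amount to here.

def is_valid_utf8(data: bytes) -> bool:
    """Hypothetical helper: True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

# "na\u00efve" contains one non-ASCII character, encoded as 2 bytes.
line = "na\u00efve".encode("utf-8")

# Reverse the raw bytes: the two bytes of the character swap places.
bytewise = line[::-1]

# Reverse decoded characters, then encode again: the character survives.
charwise = line.decode("utf-8")[::-1].encode("utf-8")

print(bytewise, is_valid_utf8(bytewise))
print(charwise, is_valid_utf8(charwise))
```

The byte-wise result is no longer valid UTF-8, which is the corruption
meant above.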

> The correct approach should be to match the execution locale to the
> file locale, for example, `LC_ALL=...UTF-8 rev ...` which should
> produce the expected results.
That's not the point. You can never be sure that your files contain no
stray, wrongly encoded byte, and rev should definitely not loop
endlessly in that case.
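
As a rough sketch of the behaviour argued for here, a line reverser can
carry an undecodable byte through unchanged instead of stalling on it.
The Python below is an illustration, not rev's implementation:
`reverse_line` is a hypothetical helper, and it relies on Python's
`surrogateescape` error handler, which maps each invalid byte to a
placeholder code point on decoding and back to the same byte on
encoding. The input is the test case from the original report.

```python
def reverse_line(raw: bytes) -> bytes:
    """Hypothetical helper: reverse one line by characters.

    Bytes that are not valid UTF-8 are passed through as single
    units rather than causing an error or a stall.
    """
    # Keep a trailing newline in place, as a line reverser would.
    if raw.endswith(b"\n"):
        body, ending = raw[:-1], b"\n"
    else:
        body, ending = raw, b""
    # Invalid bytes become lone surrogate code points on decode ...
    text = body.decode("utf-8", errors="surrogateescape")
    # ... and turn back into the original bytes on encode.
    return text[::-1].encode("utf-8", errors="surrogateescape") + ending


# Test case from the original report, including the stray \x80 byte.
data = (b"no non-ASCII characters\n"
        b"hex 01 >\x01< here\n"
        b"hex 80 >\x80< here\n"
        b"Line 4\n")

out = b"".join(reverse_line(l) for l in data.splitlines(keepends=True))
print(out)
```

Every line, including the one with the stray byte, is reversed and the
loop terminates. Whether rev should pass such bytes through or report
an error is a separate design choice.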


-- 
Problem reports:      https://cygwin.com/problems.html
FAQ:                  https://cygwin.com/faq/
Documentation:        https://cygwin.com/docs.html
Unsubscribe info:     https://cygwin.com/ml/#unsubscribe-simple

