Mail Archives: cygwin/2024/11/04/07:32:59

Message-ID: <8edfd4a5-58b9-4439-add1-66830aa48f90@towo.net>
Date: Mon, 4 Nov 2024 13:31:49 +0100
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: Is this correct behaviour for 'rev'?
To: cygwin AT cygwin DOT com
References: <CAKwdsS9FGm9nqtZ+vSQ+WEWzRf-zUFAS06eo=ASwNB6ST3gddw AT mail DOT gmail DOT com>
<6fdbf92d-51f2-47ae-a482-5edd89ed3a89 AT maxrnd DOT com>
<f58d4a6c-476d-4cc5-bad5-28c99ad75c2b AT maxrnd DOT com>
<7618ad16-fc5a-4c5c-bce2-25915c2f2cc8 AT maxrnd DOT com>
<CAKwdsS8McuC6Bw_va7DOzBr1wpOWNNU7hcrH8cjuaCuRF0mb4Q AT mail DOT gmail DOT com>
<4b8d7a6c-c070-4c90-a3ae-c4d87a5fbe6b AT towo DOT net>
<CAKwdsS_OAhOnKWMs0Y5+tRs5ShmocbvpAo4UwbanvA7MiH7=Jw AT mail DOT gmail DOT com>
In-Reply-To: <CAKwdsS_OAhOnKWMs0Y5+tRs5ShmocbvpAo4UwbanvA7MiH7=Jw@mail.gmail.com>
From: Thomas Wolff via Cygwin <cygwin AT cygwin DOT com>
Reply-To: Thomas Wolff <towo AT towo DOT net>

On 04.11.2024 at 12:10, Backwoods BC via Cygwin wrote:
> On Sun, Nov 3, 2024 at 11:42 PM Thomas Wolff via Cygwin
> <cygwin AT cygwin DOT com> wrote:
>> On 04.11.2024 at 05:56, Backwoods BC via Cygwin wrote:
>>> On Sun, Nov 3, 2024 at 1:49 AM Mark Geisert via Cygwin
>>> <cygwin AT cygwin DOT com> wrote:
>>>> Continuing my monologue, with due consideration of comments posted, ...
>>>>
>>>> On 10/23/2024 10:01 PM, Mark Geisert via Cygwin wrote:
>>>>> Replying to myself, I continue...
>>>>>
>>>>> On 10/22/2024 10:33 PM, Mark Geisert via Cygwin wrote:
>>>>>> On 10/22/2024 8:00 PM, Backwoods BC via Cygwin wrote:
>>>>>>> It appears that 'rev' is choking on any character \x80 or higher, but
>>>>>>> is OK with those \x1f or smaller. It doesn't give an error or ignore
>>>>>>> it, it just stops.
>>>>>>>
>>>>>>> I don't have access to a Linux box so I can't see if this happens
>>>>>>> there and nothing in the documentation suggests that this is the
>>>>>>> correct functionality.
>>>>>>>
>>>>>>> Test case:
>>>>>>> printf 'no non-ASCII characters\nhex 01 >\x01< here\nhex 80 >\x80< here\nLine 4\n'|rev|rev
>>>>>>>
>>>>>>> This is for "rev from util-linux 2.33.1"
>>>>>>>
>>>>>>> I don't have the current version of 'rev' on my system due to not
>>>>>>> having updated in a while. I accidentally screwed up my installation
>>>>>>> and have been reluctant to wipe it and start over.
>>>>>>>
>>>>>>> So, is this the expected behaviour for the current version of 'rev'
>>>>>>> under Cygwin and/or Linux?
>>>>>> The current Cygwin util-linux 2.39.3-2 rev behaves in the same, broken
>>>>>> way.  It looks like line-ending char(s) are not being handled
>>>>>> correctly.   Don't know yet if it's rev itself or fgetws() being used
>>>>>> by rev that's busted.  I'll investigate further.  Thanks for the report!
>>>>> This is a locale issue.  In the default Cygwin locale, rev mishandles
>>>>> the \x80 byte and instead of stopping with an error message it enters an
>>>>> infinite loop.  I'll probably report this upstream instead of working
>>>>> out a local fix.
>>>> Upstream util-linux 2.40.2 has an updated 'rev' that stops with an error
>>>> message when the OP's testcase is tried.  I'm testing the full 2.40.2
>>>> for a Cygwin release before too long.
>>>>
>>>>> There is a work-around: change to the "C" locale just to run rev.
>>>>>        LC_ALL=C rev zzz
>>>>> where zzz is a file containing your four lines.  You can also run your
>>>>> original testcase with "rev" replaced by "LC_ALL=C rev" in both places.
>>>> Implicit in that suggestion is that the OP seemed to be uninterested in
>>>> any form of multi-byte characters.. just straightforward operation on
>>>> bytes, even if they have the high bit set.
>>>>
>>>> That said, I appreciate the follow-up comments that dealt with the
>>>> general problem.
>>>> Thanks all,
>>>>
>>>> ..mark
>>> Sorry for dropping out of the thread. I lost interest in pursuing the
>>> issue once I learned that 'rev' would balk at any character it didn't
>>> like instead of just passing it through, and found a workaround for my
>>> case. What I really wanted is something that would do a byte-by-byte
>>> reversal working backwards from a LF character.
>>>
>>> My use for 'rev' is to allow sorting based on field position from the
>>> *end* of the line. 'sort' won't do this itself, as far as I can tell.
>>> My method follows:
>>> printf -v mySep '\xff'
>>> cat fileOfFullPathNames | rev | sed -r -e "s/\./$mySep/" | rev |
>>>     sort -t "$mySep" --key=2.1 | tr "$mySep" '.'
>>>
>>> This particular pipe is to sort fileOfFullPathNames by file extension.
>>> As mentioned, this stops abruptly when it encounters my inserted field
>>> separator of \xff. I found that it would do what I wanted if I used
>>> \x1f as mySep instead.
>>>
>>> To be honest, in far too many years of using *nix as a user (not a
>>> developer), doing this kind of thing is the only use I've ever had for
>>> 'rev'. I probably used a different separator before (likely \x09)
>>> which is why I haven't encountered an issue.
>>>
>>> What I appear to really need is "rev --binary" that just reverses
>>> everything regardless of what it is until it finds a LF. I may get
>>> motivated to write it for myself if I run into situations where I
>>> can't work around the restrictions in 'rev'.
>> As noted before in this thread, "rev --binary" is "LC_ALL=C rev".
> When 'rev' gets fixed, I'll try that. Until then, I'll just work
> around it as "LC_ALL=C rev" still dies when it encounters any
> byte >= \x80.
Well, it doesn't for me:
 > printf a'\x80'b | LC_ALL=C rev | od -t x1
0000000 62 80 61
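
For anyone who wants the byte-by-byte reversal discussed above without
depending on how a particular rev build handles locales, here is a minimal
sketch, assuming a POSIX shell and an awk that is byte-oriented in the C
locale. The function names revbytes and sort_by_ext are made up for
illustration and are not part of util-linux; the second function simply
re-creates the sort-by-extension pipeline quoted above on top of the first.

```shell
#!/bin/sh
# revbytes: hypothetical helper, NOT part of util-linux.
# Reverses each LF-terminated line byte by byte. LC_ALL=C makes awk treat
# every byte, including \x80..\xff, as a single character.
# Limitations: NUL bytes are not handled, and a final line that lacks a
# trailing LF gets one added.
revbytes() {
    LC_ALL=C awk '{
        out = ""
        for (i = length($0); i > 0; i--)
            out = out substr($0, i, 1)
        print out
    }'
}

# sort_by_ext: hypothetical helper following the quoted pipeline.
# Replaces the last "." of each line with a \xff separator, sorts on the
# text after that separator, then restores the ".".
sort_by_ext() {
    sep=$(printf '\377')
    revbytes | LC_ALL=C sed "s/\./$sep/" | revbytes |
        LC_ALL=C sort -t "$sep" -k 2 | LC_ALL=C tr '\377' '.'
}
```

With these defined, "sort_by_ext < fileOfFullPathNames" would stand in for
the quoted pipeline. Running every stage under LC_ALL=C is a deliberate
choice here: it keeps sed, sort and tr from interpreting \xff as part of a
multi-byte sequence.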


