Mail Archives: cygwin/2024/11/04/02:42:17

Message-ID: <4b8d7a6c-c070-4c90-a3ae-c4d87a5fbe6b@towo.net>
Date: Mon, 4 Nov 2024 08:41:46 +0100
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: Is this correct behaviour for 'rev'?
To: cygwin AT cygwin DOT com
References: <CAKwdsS9FGm9nqtZ+vSQ+WEWzRf-zUFAS06eo=ASwNB6ST3gddw AT mail DOT gmail DOT com>
<6fdbf92d-51f2-47ae-a482-5edd89ed3a89 AT maxrnd DOT com>
<f58d4a6c-476d-4cc5-bad5-28c99ad75c2b AT maxrnd DOT com>
<7618ad16-fc5a-4c5c-bce2-25915c2f2cc8 AT maxrnd DOT com>
<CAKwdsS8McuC6Bw_va7DOzBr1wpOWNNU7hcrH8cjuaCuRF0mb4Q AT mail DOT gmail DOT com>
In-Reply-To: <CAKwdsS8McuC6Bw_va7DOzBr1wpOWNNU7hcrH8cjuaCuRF0mb4Q@mail.gmail.com>
From: Thomas Wolff via Cygwin <cygwin AT cygwin DOT com>
Reply-To: Thomas Wolff <towo AT towo DOT net>


On 04.11.2024 at 05:56, Backwoods BC via Cygwin wrote:
> On Sun, Nov 3, 2024 at 1:49 AM Mark Geisert via Cygwin
> <cygwin AT cygwin DOT com> wrote:
>> Continuing my monologue, with due consideration of comments posted, ...
>>
>> On 10/23/2024 10:01 PM, Mark Geisert via Cygwin wrote:
>>> Replying to myself, I continue...
>>>
>>> On 10/22/2024 10:33 PM, Mark Geisert via Cygwin wrote:
>>>> On 10/22/2024 8:00 PM, Backwoods BC via Cygwin wrote:
>>>>> It appears that 'rev' is choking on any character \x80 or higher, but
>>>>> is OK with those \x1f or smaller. It doesn't give an error or ignore
>>>>> it, it just stops.
>>>>>
>>>>> I don't have access to a Linux box so I can't see if this happens
>>>>> there and nothing in the documentation suggests that this is the
>>>>> correct functionality.
>>>>>
>>>>> Test case:
>>>>> printf 'no non-ASCII characters\nhex 01 >\x01< here\nhex 80 >\x80< here\nLine 4\n'|rev|rev
>>>>>
>>>>> This is for "rev from util-linux 2.33.1"
>>>>>
>>>>> I don't have the current version of 'rev' on my system due to not
>>>>> having updated in a while. I accidentally screwed up my installation
>>>>> and have been reluctant to wipe it and start over.
>>>>>
>>>>> So, is this the expected behaviour for the current version of 'rev'
>>>>> under Cygwin and/or Linux?
>>>> The current Cygwin util-linux 2.39.3-2 rev behaves in the same, broken
>>>> way.  It looks like line-ending char(s) are not being handled
>>>> correctly.   Don't know yet if it's rev itself or fgetws() being used
>>>> by rev that's busted.  I'll investigate further.  Thanks for the report!
>>> This is a locale issue.  In the default Cygwin locale, rev mishandles
>>> the \x80 byte and instead of stopping with an error message it enters an
>>> infinite loop.  I'll probably report this upstream instead of working
>>> out a local fix.
>> Upstream util-linux 2.40.2 has an updated 'rev' that stops with an error
>> message when the OP's testcase is tried.  I'm testing the full 2.40.2
>> and plan a Cygwin release before too long.
>>
>>> There is a work-around: change to the "C" locale just to run rev.
>>>       LC_ALL=C rev zzz
>>> where zzz is a file containing your four lines.  You can also run your
>>> original testcase with "rev" replaced by "LC_ALL=C rev" in both places.
>> Implicit in that suggestion is that the OP seemed to be uninterested in
>> any form of multi-byte characters.. just straightforward operation on
>> bytes, even if they have the high bit set.
>>
>> That said, I appreciate the follow-up comments that dealt with the
>> general problem.
>> Thanks all,
>>
>> ..mark
> Sorry for dropping out of the thread. I lost interest in pursuing the
> issue once I learned that 'rev' would balk at any character it didn't
> like instead of just passing it through, and found a workaround for my
> case. What I really wanted is something that would do a byte-by-byte
> reversal working backwards from a LF character.
>
> My use for 'rev' is to allow sorting based on field position from the
> *end* of the line. 'sort' won't do this itself, as far as I can tell.
> My method follows:
> printf -v mySep '\xff'
> cat fileOfFullPathNames | rev | sed -r -e "s/\./$mySep/" | rev | sort -t "$mySep" --key=2.1 | tr "$mySep" '.'
>
> This particular pipe is to sort fileOfFullPathNames by file extension.
> As mentioned, this stops abruptly when it encounters my inserted field
> separator of \xff. I found that it would do what I wanted if I used
> \x1f as mySep instead.
>
> To be honest, in far too many years of using *nix as a user (not a
> developer), doing this kind of thing is the only use I've ever had for
> 'rev'. I probably used a different separator before (likely \x09)
> which is why I haven't encountered an issue.
>
> What I appear to really need is "rev --binary" that just reverses
> everything regardless of what it is until it finds a LF. I may get
> motivated to write it for myself if I run into situations where I
> can't work around the restrictions in 'rev'.
As noted before in this thread, "rev --binary" is "LC_ALL=C rev".

-- 
Problem reports:      https://cygwin.com/problems.html
FAQ:                  https://cygwin.com/faq/
Documentation:        https://cygwin.com/docs.html
Unsubscribe info:     https://cygwin.com/ml/#unsubscribe-simple
