References: | <CAKwdsS9FGm9nqtZ+vSQ+WEWzRf-zUFAS06eo=ASwNB6ST3gddw AT mail DOT gmail DOT com> |
<6fdbf92d-51f2-47ae-a482-5edd89ed3a89 AT maxrnd DOT com> | |
<f58d4a6c-476d-4cc5-bad5-28c99ad75c2b AT maxrnd DOT com> | |
<7618ad16-fc5a-4c5c-bce2-25915c2f2cc8 AT maxrnd DOT com> | |
In-Reply-To: | <7618ad16-fc5a-4c5c-bce2-25915c2f2cc8@maxrnd.com> |
Date: | Sun, 3 Nov 2024 20:56:38 -0800 |
Message-ID: | <CAKwdsS8McuC6Bw_va7DOzBr1wpOWNNU7hcrH8cjuaCuRF0mb4Q@mail.gmail.com> |
Subject: | Re: Is this correct behaviour for 'rev'? |
To: | cygwin AT cygwin DOT com |
From: | Backwoods BC via Cygwin <cygwin AT cygwin DOT com> |
Reply-To: | Backwoods BC <completely DOT and DOT totally DOT trash AT gmail DOT com> |

On Sun, Nov 3, 2024 at 1:49 AM Mark Geisert via Cygwin <cygwin AT cygwin DOT com> wrote:
>
> Continuing my monologue, with due consideration of comments posted, ...
>
> On 10/23/2024 10:01 PM, Mark Geisert via Cygwin wrote:
> > Replying to myself, I continue...
> >
> > On 10/22/2024 10:33 PM, Mark Geisert via Cygwin wrote:
> >> On 10/22/2024 8:00 PM, Backwoods BC via Cygwin wrote:
> >>> It appears that 'rev' is choking on any character \x80 or higher, but
> >>> is OK with those \x1f or smaller. It doesn't give an error or ignore
> >>> it, it just stops.
> >>>
> >>> I don't have access to a Linux box so I can't see if this happens
> >>> there and nothing in the documentation suggests that this is the
> >>> correct functionality.
> >>>
> >>> Test case:
> >>> printf 'no non-ASCII characters\nhex 01 >\x01< here\nhex 80 >\x80<
> >>> here\nLine 4\n'|rev|rev
> >>>
> >>> This is for "rev from util-linux 2.33.1"
> >>>
> >>> I don't have the current version of 'rev' on my system due to not
> >>> having updated in a while. I accidentally screwed up my installation
> >>> and have been reluctant to wipe it and start over.
> >>>
> >>> So, is this the expected behaviour for the current version of 'rev'
> >>> under Cygwin and/or Linux?
> >>
> >> The current Cygwin util-linux 2.39.3-2 rev behaves in the same, broken
> >> way. It looks like line-ending char(s) are not being handled
> >> correctly. Don't know yet if it's rev itself or fgetws() being used
> >> by rev that's busted. I'll investigate further. Thanks for the report!
> >
> > This is a locale issue. In the default Cygwin locale, rev mishandles
> > the \x80 byte and instead of stopping with an error message it enters an
> > infinite loop. I'll probably report this upstream instead of working
> > out a local fix.
>
> Upstream util-linux 2.40.2 has an updated 'rev' that stops with an error
> message when the OP's testcase is tried. I'm testing the full 2.40.2
> for Cygwin release before too long.
>
> > There is a work-around: change to the "C" locale just to run rev.
> >     LC_ALL=C rev zzz
> > where zzz is a file containing your four lines. You can also run your
> > original testcase with "rev" replaced by "LC_ALL=C rev" in both places.
>
> Implicit in that suggestion is that the OP seemed to be uninterested in
> any form of multi-byte characters.. just straightforward operation on
> bytes, even if they have the high bit set.
>
> That said, I appreciate the follow-up comments that dealt with the
> general problem.
> Thanks all,
>
> ..mark

Sorry for dropping out of the thread. I lost interest in pursuing the
issue once I learned that 'rev' would balk at any character it didn't
like instead of just passing it through, and found a workaround for my
case. What I really wanted is something that would do a byte-by-byte
reversal working backwards from a LF character.

My use for 'rev' is to allow sorting based on field position from the
*end* of the line. 'sort' won't do this itself, as far as I can tell.
My method follows:

    printf -v mySep '\xff'
    cat fileOfFullPathNames | rev | sed -r -e "s/\./$mySep/" | rev | sort -t "$mySep" --key=2.1 | tr "$mySep" '.'

This particular pipe is to sort fileOfFullPathNames by file extension.
As mentioned, this stops abruptly when it encounters my inserted field
separator of \xff. I found that it would do what I wanted if I used
\x1f as mySep instead.

To be honest, in far too many years of using *nix as a user (not a
developer), doing this kind of thing is the only use I've ever had for
'rev'. I probably used a different separator before (likely \x09)
which is why I haven't encountered an issue.

What I appear to really need is "rev --binary" that just reverses
everything regardless of what it is until it finds a LF. I may get
motivated to write it for myself if I run into situations where I
can't work around the restrictions in 'rev'.