Message-ID: <4b8d7a6c-c070-4c90-a3ae-c4d87a5fbe6b@towo.net>
Date: Mon, 4 Nov 2024 08:41:46 +0100
Subject: Re: Is this correct behaviour for 'rev'?
To: cygwin AT cygwin DOT com
From: Thomas Wolff via Cygwin
Reply-To: Thomas Wolff
List-Id: General Cygwin discussions and problem reports

On 04.11.2024 at 05:56, Backwoods BC via Cygwin wrote:
> On Sun, Nov 3, 2024 at 1:49 AM Mark Geisert via Cygwin wrote:
>> Continuing my monologue, with due consideration of comments posted, ...
>>
>> On 10/23/2024 10:01 PM, Mark Geisert via Cygwin wrote:
>>> Replying to myself, I continue...
>>>
>>> On 10/22/2024 10:33 PM, Mark Geisert via Cygwin wrote:
>>>> On 10/22/2024 8:00 PM, Backwoods BC via Cygwin wrote:
>>>>> It appears that 'rev' is choking on any character \x80 or higher, but
>>>>> is OK with those \x1f or smaller. It doesn't give an error or ignore
>>>>> it, it just stops.
>>>>>
>>>>> I don't have access to a Linux box so I can't see if this happens
>>>>> there and nothing in the documentation suggests that this is the
>>>>> correct functionality.
>>>>>
>>>>> Test case:
>>>>> printf 'no non-ASCII characters\nhex 01 >\x01< here\nhex 80 >\x80< here\nLine 4\n'|rev|rev
>>>>>
>>>>> This is for "rev from util-linux 2.33.1"
>>>>>
>>>>> I don't have the current version of 'rev' on my system due to not
>>>>> having updated in a while. I accidentally screwed up my installation
>>>>> and have been reluctant to wipe it and start over.
>>>>>
>>>>> So, is this the expected behaviour for the current version of 'rev'
>>>>> under Cygwin and/or Linux?
>>>> The current Cygwin util-linux 2.39.3-2 rev behaves in the same, broken
>>>> way. It looks like line-ending char(s) are not being handled
>>>> correctly. Don't know yet if it's rev itself or fgetws() being used
>>>> by rev that's busted. I'll investigate further. Thanks for the report!
>>> This is a locale issue. In the default Cygwin locale, rev mishandles
>>> the \x80 byte and instead of stopping with an error message it enters an
>>> infinite loop.
>>> I'll probably report this upstream instead of working out a local fix.
>> Upstream util-linux 2.40.2 has an updated 'rev' that stops with an error
>> message when the OP's testcase is tried. I'm testing the full 2.40.2
>> for Cygwin release before too long.
>>
>>> There is a work-around: change to the "C" locale just to run rev.
>>>     LC_ALL=C rev zzz
>>> where zzz is a file containing your four lines. You can also run your
>>> original testcase with "rev" replaced by "LC_ALL=C rev" in both places.
>> Implicit in that suggestion is that the OP seemed to be uninterested in
>> any form of multi-byte characters.. just straightforward operation on
>> bytes, even if they have the high bit set.
>>
>> That said, I appreciate the follow-up comments that dealt with the
>> general problem.
>> Thanks all,
>>
>> ..mark
> Sorry for dropping out of the thread. I lost interest in pursuing the
> issue once I learned that 'rev' would balk at any character it didn't
> like instead of just passing it through, and found a workaround for my
> case. What I really wanted is something that would do a byte-by-byte
> reversal working backwards from a LF character.
>
> My use for 'rev' is to allow sorting based on field position from the
> *end* of the line. 'sort' won't do this itself, as far as I can tell.
> My method follows:
> printf -v mySep '\xff'
> cat fileOfFullPathNames | rev | sed -r -e "s/\./$mySep/" | rev | sort -t "$mySep" --key=2.1 | tr "$mySep" '.'
>
> This particular pipe is to sort fileOfFullPathNames by file extension.
> As mentioned, this stops abruptly when it encounters my inserted field
> separator of \xff. I found that it would do what I wanted if I used
> \x1f as mySep instead.
>
> To be honest, in far too many years of using *nix as a user (not a
> developer), doing this kind of thing is the only use I've ever had for
> 'rev'. I probably used a different separator before (likely \x09)
> which is why I haven't encountered an issue.
>
> What I appear to really need is "rev --binary" that just reverses
> everything regardless of what it is until it finds a LF. I may get
> motivated to write it for myself if I run into situations where I
> can't work around the restrictions in 'rev'.

As noted before in this thread, "rev --binary" is "LC_ALL=C rev".
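
For illustration, here is a rough sketch of the extension-sort pipe from the quoted message, combining the two workarounds mentioned in this thread: the \x1f separator (written as octal \037 here) and running the tools under LC_ALL=C. The function name sort_by_extension is made up for this example, and the sketch assumes that the separator byte never occurs in the input.

```shell
#!/bin/sh
# Sketch only: sort_by_extension is a hypothetical helper name, not a tool
# shipped with util-linux or coreutils.
sort_by_extension() {
    # Temporary field separator: byte \037 (\x1f), as used in the thread.
    # Assumption: this byte never appears in the input lines.
    sep=$(printf '\037')
    # Reverse each line so the LAST dot becomes the first one, mark it,
    # reverse back, sort on the part after the mark, then restore the dot.
    LC_ALL=C rev |
        LC_ALL=C sed "s/\./$sep/" |
        LC_ALL=C rev |
        LC_ALL=C sort -t "$sep" -k 2 |
        tr "$sep" '.'
}

# Example input; note the dot in the directory name of the second path.
printf '%s\n' /a/b.txt /c.d/e.c /f/g.h | sort_by_extension
```

Only the last dot of each line is treated as the start of the extension, so a dot in a directory name does not affect the sort key. Lines without any dot have an empty key and sort first.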