Mail Archives: cygwin/2024/10/24/01:02:19

Message-ID: <f58d4a6c-476d-4cc5-bad5-28c99ad75c2b@maxrnd.com>
Date: Wed, 23 Oct 2024 22:01:22 -0700
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: Is this correct behaviour for 'rev'?
To: cygwin AT cygwin DOT com
References: <CAKwdsS9FGm9nqtZ+vSQ+WEWzRf-zUFAS06eo=ASwNB6ST3gddw AT mail DOT gmail DOT com>
<6fdbf92d-51f2-47ae-a482-5edd89ed3a89 AT maxrnd DOT com>
In-Reply-To: <6fdbf92d-51f2-47ae-a482-5edd89ed3a89@maxrnd.com>
From: Mark Geisert via Cygwin <cygwin AT cygwin DOT com>
Reply-To: Mark Geisert <mark AT maxrnd DOT com>

Replying to myself, I continue...

On 10/22/2024 10:33 PM, Mark Geisert via Cygwin wrote:
> On 10/22/2024 8:00 PM, Backwoods BC via Cygwin wrote:
>> It appears that 'rev' is choking on any character \x80 or higher, but
>> is OK with those \x1f or smaller. It doesn't give an error or ignore
>> it, it just stops.
>>
>> I don't have access to a Linux box so I can't see if this happens
>> there and nothing in the documentation suggests that this is the
>> correct functionality.
>>
>> Test case:
>> printf 'no non-ASCII characters\nhex 01 >\x01< here\nhex 80 >\x80< here\nLine 4\n'|rev|rev
>>
>> This is for "rev from util-linux 2.33.1"
>>
>> I don't have the current version of 'rev' on my system due to not
>> having updated in a while. I accidentally screwed up my installation
>> and have been reluctant to wipe it and start over.
>>
>> So, is this the expected behaviour for the current version of 'rev'
>> under Cygwin and/or Linux?
> 
> The current Cygwin util-linux 2.39.3-2 rev behaves in the same, broken
> way.  It looks like line-ending char(s) are not being handled correctly.
> Don't know yet if it's rev itself or fgetws() being used by rev that's
> busted.  I'll investigate further.  Thanks for the report!

This is a locale issue.  In the default Cygwin locale, rev mishandles
the \x80 byte: instead of stopping with an error message, it enters an
infinite loop.  I'll probably report this upstream rather than work
out a local fix.

There is a work-around: change to the "C" locale just to run rev.
     LC_ALL=C rev zzz
where zzz is a file containing your four lines.  You can also run your 
original testcase with "rev" replaced by "LC_ALL=C rev" in both places.
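
To spell the work-around out, here is a minimal sketch.  The names rev_c and roundtrip_testcase are made up for illustration (they are not part of util-linux), and the printf is assumed to understand \x escapes, as bash's builtin does.

```shell
# Hypothetical helper: run rev with the "C" locale set for that one
# command only, leaving the rest of the shell session untouched.
rev_c() {
    LC_ALL=C rev "$@"
}

# The original test case with both rev calls replaced by the wrapper.
# Assumes a printf that understands \x escapes (bash's builtin does).
roundtrip_testcase() {
    printf 'no non-ASCII characters\nhex 01 >\x01< here\nhex 80 >\x80< here\nLine 4\n' | rev_c | rev_c
}
```

Running "rev_c zzz" is the same as the LC_ALL=C line above.  If the work-around applies on your system, calling roundtrip_testcase should hand back all four lines.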
HTH,

..mark

P.S. ASCII runs from \x00 to \x7F, so your \x80 is non-ASCII FWIW ;-)
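
If you want to check whether a file contains any such bytes before feeding it to rev, something like the following would do as a rough sketch.  count_non_ascii is a made-up name; it relies only on tr and wc.

```shell
# Hypothetical helper: count the bytes on stdin that fall outside the
# ASCII range \x00-\x7F (octal \000-\177).
count_non_ascii() {
    # Delete every ASCII byte, count what is left, and strip the padding
    # that some wc implementations put around the number.
    LC_ALL=C tr -d '\000-\177' | wc -c | tr -d ' '
}
```

For example, "count_non_ascii < zzz" on the four-line test file should print 1, since \x80 is its only non-ASCII byte (\x01 is a control character but still ASCII).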

-- 
Problem reports:      https://cygwin.com/problems.html
FAQ:                  https://cygwin.com/faq/
Documentation:        https://cygwin.com/docs.html
Unsubscribe info:     https://cygwin.com/ml/#unsubscribe-simple
