| Mailing-List: | contact cygwin-help AT cygwin DOT com; run by ezmlm |
| List-Id: | <cygwin.cygwin.com> |
| List-Subscribe: | <mailto:cygwin-subscribe AT cygwin DOT com> |
| List-Archive: | <http://sourceware.org/ml/cygwin/> |
| List-Post: | <mailto:cygwin AT cygwin DOT com> |
| List-Help: | <mailto:cygwin-help AT cygwin DOT com>, <http://sourceware.org/ml/#faqs> |
| Sender: | cygwin-owner AT cygwin DOT com |
| Mail-Followup-To: | cygwin AT cygwin DOT com |
| Delivered-To: | mailing list cygwin AT cygwin DOT com |
| From: | "Jannick" <thirdedition AT gmx DOT net> |
| To: | "'Roger Krebs'" <Roger DOT Krebs AT stage-entertainment DOT com>, <cygwin AT cygwin DOT com> |
| References: | <004401d3109c$2dcb09e0$89611da0$@gmx.net> <598a47fc DOT 5501ca0a DOT 5476f DOT 0305 AT mx DOT google DOT com> <004701d310a9$372363e0$a56a2ba0$@gmx.net> <DB6PR0601MB2085E2D695EB6D60B09FE77BBA8B0 AT DB6PR0601MB2085 DOT eurprd06 DOT prod DOT outlook DOT com> |
| In-Reply-To: | <DB6PR0601MB2085E2D695EB6D60B09FE77BBA8B0@DB6PR0601MB2085.eurprd06.prod.outlook.com> |
| Subject: | RE: gawk 4.1.4: CR separate char for CRLF files |
| Date: | Wed, 9 Aug 2017 10:37:58 +0200 |
| Message-ID: | <001001d310ea$ceeee230$6ccca690$@gmx.net> |
| MIME-Version: | 1.0 |
Hi Roger,
On Wed, 9 Aug 2017 07:03:24 +0000, Roger Krebs wrote:
> I've added a BEGIN section at the beginning of the awk script file, setting
> the record separator explicitly for the input file (RS) as well as for the
> output file (ORS):
>
> BEGIN {
> RS="\r\n"
> ORS="\r\n"
> }
> {
> ... your script
> }
>
> Especially the RS parameter wasn't necessary in the past but now it is.
Which is pretty much a pain when a major change comes without an easy
fallback. E.g. for sed - if I understand the reference to sed in
https://cygwin.com/ml/cygwin/2017-08/msg00033.html correctly - a separate
switch '-b' was added. For the latest gawk version I cannot see anything
like that, which means that all of our awk scripts run against Cygwin's gawk
break unless they are tweaked - unless I am missing something here.
This is - to say the least - unpleasant in the light of what Cygwin claims
to be, namely 'a large collection of GNU and Open Source tools which provide
functionality similar to a Linux distribution on Windows' (from the top of
the home page www.cygwin.com). Again, admittedly I did not dive into the
discussion and the substance of the reasoning behind making this change to
gawk, sed and grep.
Now I can see the following *easy* workarounds for the situation here
(input side only for now):
1 - Inserting the BEGIN section as you suggested into more than 1k scripts
(not feasible due to the additional regression test workload)
2 - Calling "gawk -v 'RS=\r\n' -v 'ORS=\r\n'" instead of plain "gawk" (a hack
to undo the complexity added by the latest gawk; wrapper needed)
3 - Wrapping gawk in a d2u/u2d pipe (additional app and wrapper needed
again)
4 - Using another build of gawk which does *not* disable gawk's
out-of-the-box feature of swallowing CRs (cf., e.g.,
http://git.savannah.gnu.org/cgit/gawk.git/tree/awkgram.y#n3543), i.e.
without the artificial obstacle of having to know the EOL type of the input
file before running gawk.
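For illustration, option 2 could be sketched as a small shell wrapper. The
names crlf_awk and AWK are made up for this example and are not part of
Cygwin or gawk. The sketch also assumes an awk that accepts a
multi-character RS (gawk does; POSIX leaves the behaviour unspecified).

```shell
#!/bin/sh
# Sketch only: crlf_awk and AWK are hypothetical names.
# Prefer gawk, fall back to whatever awk is installed.
AWK=$(command -v gawk || command -v awk)

# Run awk with CRLF as input and output record separator, so the
# awk scripts themselves stay untouched.
crlf_awk() {
    # The single quotes keep the backslashes away from the shell;
    # awk expands \r and \n itself when it processes the -v assignments.
    "$AWK" -v 'RS=\r\n' -v 'ORS=\r\n' "$@"
}

# Usage: print the second field of each record of a CRLF input;
# od -c makes the line endings of the output visible.
printf 'a b\r\nc d\r\n' | crlf_awk '{ print $2 }' | od -c
```

Makefiles would then have to call the wrapper instead of gawk, which is
exactly the extra moving part mentioned under option 2.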
> It works in all my cases. The only disadvantage: you have to know what kind
... plus the disadvantage of having to amend all the scripts systematically
instead of having an external solution
> of files you want to handle in the awk script. The same awk script will not
> work for DOS files as well as for linux files.
... another issue caused by the change which didn't exist before.
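As a possible way around the DOS-vs-Linux problem: a sketch that accepts
both CRLF and LF input with one setting, assuming the awk in use treats a
multi-character RS as a regular expression (gawk does). The names
eol_agnostic_awk and AWK are hypothetical. Output uses awk's default ORS,
i.e. plain LF.

```shell
#!/bin/sh
# Sketch only: eol_agnostic_awk and AWK are hypothetical names.
# Prefer gawk, fall back to whatever awk is installed.
AWK=$(command -v gawk || command -v awk)

# RS is set to the regular expression "optional CR, then LF", so a
# record ends at either CRLF or LF and no CR is left in the last field.
eol_agnostic_awk() {
    "$AWK" -v 'RS=\r?\n' "$@"
}

# Usage: the same one-liner applied to a DOS-style and a Unix-style input.
printf 'x 1\r\ny 2\r\n' | eol_agnostic_awk '{ print $2 }'
printf 'x 1\ny 2\n'     | eol_agnostic_awk '{ print $2 }'
```

This still needs a wrapper around every gawk call, but the EOL type of the
input no longer has to be known in advance.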
> Best
>
> Roger
Please don't get me wrong, but this raises a real issue here and I am not
sure which rationale other than 'let's get more of the Linux-feel' drove the
decision.
All the best,
J.
> -----Original Message-----
> From: cygwin-owner AT cygwin DOT com [mailto:cygwin-owner AT cygwin DOT com] On
> Behalf Of Jannick
> Sent: Wednesday, 9 August 2017 02:48
> To: cygwin AT cygwin DOT com
> Subject: RE: gawk 4.1.4: CR separate char for CRLF files
>
> On Tue, 08 Aug 2017 16:23:40 -0700 (PDT), Steven Penny wrote:
> > On Wed, 9 Aug 2017 01:15:08, "Jannick" wrote:
> > > the current version 4.1.4 of gawk appears to unpleasantly treat CR
> > > for CRLF files, i.e. CR is not gracefully swallowed, but is a
> > > separate character.
> > >
> > > This makes some, if not all, of the scripts we are working with here
> > > useless, unless the input files are converted to LF which certainly
> > > is not feasible. IIRC the issue did not show up some versions back.
> > >
> > > Is this a bug - or am I missing something here?
> >
> > Learn to read:
> >
> > http://cygwin.com/ml/cygwin/2017-08/msg00033.html
>
> Thanks - quickly done.
>
> The link reveals that CRLF/LF conversion is now mandatory to work with
> cygwin's gawk on DOS machines. As far as I can see there is no legacy
> solution like for, e.g., sed (-b switch) to have an easy solution for the
> issue, especially when invoking gawk from makefiles (piping).
>
> I consider this bad news while admittedly not fully understanding the whole
> background of the move which is not necessary for now.
--
Problem reports: http://cygwin.com/problems.html
FAQ: http://cygwin.com/faq/
Documentation: http://cygwin.com/docs.html
Unsubscribe info: http://cygwin.com/ml/#unsubscribe-simple