Mail Archives: cygwin/2009/10/22/21:43:37

X-Recipient: archive-cygwin AT delorie DOT com
X-SWARE-Spam-Status: No, hits=-1.7 required=5.0 tests=AWL,BAYES_00,SPF_PASS
X-Spam-Check-By: sourceware.org
Message-ID: <4AE10A39.8040005@comcast.net>
Date: Thu, 22 Oct 2009 21:43:21 -0400
From: "P.A.Long" <T DOT A DOT N DOT S DOT T DOT A DOT A DOT F DOT L AT comcast DOT net>
User-Agent: Mozilla-Thunderbird 2.0.0.22 (X11/20090707)
MIME-Version: 1.0
To: cygwin AT cygwin DOT com
CC: phillip DOT long AT gossinternational DOT com
Subject: Re: Re: gawk Has Problem With CRLF in Mixed Binary/Text Files
References: <4ADFC176 DOT 1060403 AT gmail DOT com>
In-Reply-To: <4ADFC176.1060403@gmail.com>
Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm
List-Id: <cygwin.cygwin.com>
List-Subscribe: <mailto:cygwin-subscribe AT cygwin DOT com>
List-Archive: <http://sourceware.org/ml/cygwin/>
List-Post: <mailto:cygwin AT cygwin DOT com>
List-Help: <mailto:cygwin-help AT cygwin DOT com>, <http://sourceware.org/ml/#faqs>
Sender: cygwin-owner AT cygwin DOT com
Mail-Followup-To: cygwin AT cygwin DOT com
Delivered-To: mailing list cygwin AT cygwin DOT com

--------------060905010107000504020700
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit

Dave Korn wrote:
> t.a.n.s.t.a.a.f.l@ wrote:
> 
>> [ ... ] but as can be seen by the attached files, the downloaded 
>> gawk executable always changes CRLF to LF, 
> 
>   Is this what you're looking for?
> 
>> File: gawk.info,  Node: User-modified,  Next: Auto-set,  Up: Built-in Variables
>>
>> 6.5.1 Built-in Variables That Control `awk'
>> -------------------------------------------
>         [ ... snip ... ]
>> `BINMODE #'
>>      On non-POSIX systems, this variable specifies use of binary mode
>>      for all I/O.  Numeric values of one, two, or three specify that
>         [ ... snip ... ]
>>      but `gawk' generates a warning message.  `BINMODE' is described in
>>      more detail in *note PC Using::.
>         [ ... snip ... ]
> 
>   Or this?
> 
>> File: gawk.info,  Node: PC Using,  Next: Cygwin,  Prev: PC Dynamic,  Up: PC Installation
>>
>> B.3.3.4 Using `gawk' on PC Operating Systems
>> ............................................
>         [ ... snip ... ]
>>    Under OS/2 and DOS, `gawk' (and many other text programs) silently
>> translate end-of-line `"\r\n"' to `"\n"' on input and `"\n"' to
>> `"\r\n"' on output.  A special `BINMODE' variable allows control over
>> these translations and is interpreted as follows:
>         [ ... continues ... ]
> 
> 
>     cheers,
>       DaveK
> 

Dave:

Didn't work (see attached file).  Of course, I could have set BINMODE 
the wrong way, but I used the -v method to make sure that it got set 
before the BEGIN action.  If you see anything wrong with what I've done, 
please tell me; I'm not too proud to admit silly mistakes!
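
For reference, here is a minimal sketch of why the -v method matters. It assumes a POSIX awk is on the PATH (substitute gawk if preferred) and uses a made-up variable name, `mode`, instead of BINMODE so that it behaves the same on any platform. A -v assignment is applied before BEGIN runs; an assignment given as a command-line operand is applied only when awk reaches that operand while reading input, which is after BEGIN.

```shell
#!/bin/sh
# Sketch only: `mode` is a hypothetical variable name chosen for illustration.

# -v assignment: the value is already visible inside BEGIN.
early=$(awk -v mode=3 'BEGIN { print mode }')

# Operand assignment: BEGIN runs first, so `mode` is still unset there.
# The brackets make the empty value visible. Because the program has only
# a BEGIN rule, awk exits without ever processing the operand.
late=$(awk 'BEGIN { print "[" mode "]" }' mode=3 </dev/null)

echo "with -v:    $early"
echo "as operand: $late"
```

With a built-in variable such as BINMODE, the operand form shows the variable's default inside BEGIN rather than an empty string, which matches the 0 printed by the first command in the attached transcript.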

And speaking of silly mistakes, I should have looked at the manpage for 
gawk *and* at the documentation for Cygwin before I posted; I'll do that 
tomorrow.  I'll post anything I find; in the meantime, at the suggestion 
of David Dyck, I'll be running a2p on my gawk script and using perl instead.

					Thx, Phil Long
  << File:  cygwinGawk-withBINMODEset.stillNotWorking >>



--------------060905010107000504020700
Content-Type: text/plain;
 name="cygwinGawk-withBINMODEset.stillNotWorking"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
 filename="cygwinGawk-withBINMODEset.stillNotWorking"

/work> 
/work> 
/work> 
/work> gawk 'BEGIN{print BINMODE}' BINMODE=3                         # just pressed <ENTER> 
0
/work> gawk -v BINMODE=3 'BEGIN{print BINMODE};END{print BINMODE}'   # pressed C-d 
3
3
/work> echo -en "1111\r2222\n3333\r\n4444\n\r5555"                                        | xxd -g 1 -u # UH-oh!  
0000000: 31 31 31 31 0D 32 32 32 32 0A 33 33 33 33 0D 0A  1111.2222.3333..
0000010: 34 34 34 34 0A 0D 35 35 35 35                    4444..5555
/work> echo -en "1111\r2222\n3333\r\n4444\n\r5555" | gawk -v BINMODE=3 '//' {O,}RS="xxxx" | xxd -g 1 -u 
0000000: 31 31 31 31 0D 32 32 32 32 0A 33 33 33 33 0D 0A  1111.2222.3333..
0000010: 34 34 34 34 0A 0D 35 35 35 35 78 78 78 78        4444..5555xxxx
/work> echo -en "1111\r2222\n3333\r\n4444\n\r5555" | gawk -v BINMODE=3 '//' {O,}RS="\r\n" | xxd -g 1 -u 
0000000: 31 31 31 31 0D 32 32 32 32 0A 33 33 33 33 0D 0A  1111.2222.3333..
0000010: 34 34 34 34 0A 0D 35 35 35 35 0D 0A              4444..5555..
/work> echo -en "1111\r2222\n3333\r\n4444\n\r5555" | gawk -v BINMODE=3 '//' {O,}RS="\n\r" | xxd -g 1 -u 
0000000: 31 31 31 31 0D 32 32 32 32 0A 33 33 33 33 0D 0A  1111.2222.3333..
0000010: 34 34 34 34 0A 0D 35 35 35 35 0A 0D              4444..5555..
/work> echo -en "1111\r2222\n3333\r\n4444\n\r5555" | gawk -v BINMODE=3 '//' {O,}RS="\r"   | xxd -g 1 -u 
0000000: 31 31 31 31 0D 32 32 32 32 0A 33 33 33 33 0D 0A  1111.2222.3333..
0000010: 34 34 34 34 0A 0D 35 35 35 35 0D                 4444..5555.
/work> echo -en "1111\r2222\n3333\r\n4444\n\r5555" | gawk -v BINMODE=3 '//' {O,}RS="\n"   | xxd -g 1 -u 
0000000: 31 31 31 31 0D 32 32 32 32 0A 33 33 33 33 0D 0A  1111.2222.3333..
0000010: 34 34 34 34 0A 0D 35 35 35 35 0A                 4444..5555.
/work> 
/work> 
/work> 


--------------060905010107000504020700
Content-Type: text/plain; charset=us-ascii

--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
--------------060905010107000504020700--
