From: "Peter S Tillier"
To: "Pieter Prinsloo", "Cygwin"
Subject: Re: problem report: gawk 3.1.1
Date: Fri, 21 Feb 2003 22:40:17 -0000

Pieter Prinsloo wrote:
> Hi.
>
> Have a query/problem with gawk version 3.1.1-5 dated 17/Oct/2002.
> (allthough the problem as stated is for cygwin - it can also be
> emulated in Linux
> with gawk 3.1.0)
>
> Given the following example
> ==ort awk program file={
> one =printf("%s",$1);
> two =printf("%s",$2);
> printf("LEFT=:right=:\n",$1,$2);
> printf("left=:right=:\n",one,two);
> }
> =

The above shouldn't even compile in awk. awk isn't C or perl, and
printf() doesn't return a value (nor do you need the semicolons).
Did you mean sprintf()? It is usually best to use copy and paste to
put your bash session into questions of this sort.

I took your awk program and changed it to use sprintf():

    {
    one = sprintf("%s",$1);
    two = sprintf("%s",$2);
    printf("LEFT=:right=:\n",$1,$2);
    printf("left=:right=:\n",one,two);
    }

Even so, your code will only ever print out

    LEFT=:right=:
    left=:right=:
    LEFT=:right=:
    left=:right=:
    LEFT=:right=:
    left=:right=:

with the sample file that you quote because the printf calls don't
specify "%s" anywhere.

Correcting this to:

    {
    one = sprintf("%s",$1);
    two = sprintf("%s",$2);
    printf("LEFT=: %s right=: %s\n",$1,$2);
    printf("left=: %s right=: %s\n",one,two);
    }

gives this:

    LEFT=: left side of record 1 right=: right side of record
    left=: left side of record 1 right=: right side of record
    LEFT=: left side of record 2 right=: right side of record
    left=: left side of record 2 right=: right side of record
    LEFT=: left side of record 3 right=: right side of record
    left=: left side of record 3 right=: right side of record

as its output.

I suggest that you take a look at the gawk manual and read the
section about the BINMODE variable, which is used to determine how
gawk deals with line end conversion.

Remember that, in general, Cygwin sets up a UNIX-like file system and
file handling so the line terminator is expected to be "\n", whereas
"MSDOS"-created files have "\r\n" line terminators. Under UNIX if you
pass an MSDOS line terminated file to gawk it will not treat "\r\n" as
the terminator, but "\n". In other words each line read will end with
a "\r" and this may cause output lines to overwrite all or part of
earlier ones.

Something like this:

    $ echo -e "a b c \r" | gawk '{print NF}'
    4
    $

is correct behaviour because the "\r" character represents a separate
field. And notice that

    $ echo -e "a b c \r" | gawk '{print $0, NF}'
     4b c
    $

is correct under UNIX/Linux. In this case the "\r" causes the OFS
space and the digit 4 to overwrite the "a " of "a b c ".

By using the sed command that you mention you are using it to remove
the "\r" characters.

Note, in the above I'm assuming that you are using a full Cygwin
installation including bash. You will possibly get different results
if you run gawk from a DOS box under Windows.

HTH

--
Peter S Tillier
"Who needs perl when you can write dc and sokoban in sed?"
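
As a rough illustration of the "\r" point above, the trailing carriage return can also be dropped inside awk itself rather than with a separate sed pass. This is only a sketch, not something from the original thread: the helper name count_fields is hypothetical, and it assumes a POSIX-style awk that understands "\r" in a regular expression (gawk does).

```shell
# Hypothetical helper (not from the original post): print the field count
# of each record after removing a trailing carriage return, if present.
count_fields() {
    # Modifying $0 with sub() makes awk re-split the record,
    # so NF no longer counts the lone "\r" as a field.
    awk '{ sub(/\r$/, ""); print NF }'
}

# Same input as the echo -e example: three fields, a space, then "\r".
printf 'a b c \r\n' | count_fields    # prints 3
```

Without the sub() call the same input reports 4 fields, matching the session shown earlier. Whether to strip the "\r" in awk or to convert the file beforehand is a matter of taste; doing it in the program keeps the script usable on files with either kind of line ending.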