To: cygwin@cygwin.com
From: Ralf <wiesweg@tacos-gmbh.de>
Subject: Re: length in gawk returns wrong value
Date: Thu, 19 Jul 2012 11:27:01 +0000 (UTC)

Corinna Vinschen <corinna-cygwin <at> cygwin.com> writes:

> 
> Uh oh.  1.7.9 is old.  Please update.
> 
> > 0000000   R 374   c   k   e   n  \r  \n
> > 0000010
> > Length: 1
> > 
> > What can I do to get the correct length in gawk without changing
> > ttt.txt?
> 
> Dunno.  This is not what I see.  What did you have $LANG and $LC_CTYPE
> set to?  Here's what I see:
> 
>   $ uname -a
>   CYGWIN_NT-6.1 vmbert7 1.7.16(0.261/5/3) 2012-07-09 14:51 i686 Cygwin
> 
>   $ echo $LANG
>   C.UTF-8
> 
>   $ echo "Rücken" > ttt.txt
>   $ od -c ttt.txt
>   0000000   R 303 274   c   k   e   n  \n
>   0000010
> 
>   $ gawk '{print "Length: " length($0)}' ttt.txt
>   Length: 6
> 
>   $ gawk --version | head -1
>   GNU Awk 4.0.1
> 
> Corinna
> 

After updating, I added the following lines at the top of my script:
 export LANG=C.UTF-8
 echo LANG: $LANG
 echo LC_CTYPE: $LC_CTYPE
 c:/unix/bin/gawk --version | head -1

And this is my output:
 LANG: C.UTF-8
 LC_CTYPE:
 GNU Awk 4.0.1
 CYGWIN_NT-6.0-WOW64 WIESWEG 1.7.15(0.260/5/3) 2012-05-09 10:25 i686 Cygwin
 0000000   R 374   c   k   e   n  \r  \n
 0000010
 Length: 5

Very strange!

But after adding
 export LC_CTYPE=C
I got the correct result.
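
For anyone finding this thread later: the od dump shows the ü as the single byte 374 (ISO-8859-1), not the UTF-8 pair 303 274 from Corinna's test, so under a UTF-8 LC_CTYPE that byte is not a valid character on its own. LC_CTYPE takes precedence over LANG, which is why setting it to C changes what length() counts. Below is a minimal sketch of byte counting in the C locale. The function names are made up for illustration, and stripping the trailing \r is my own assumption about what one would usually want, not something from the script above.

```shell
# Sample lines as raw bytes, written with octal escapes so the
# result does not depend on the terminal or editor encoding.
# (Hypothetical helper names, for illustration only.)
latin1_line() { printf 'R\374cken\r\n'; }    # ISO-8859-1: "ü" is one byte, CRLF ending
utf8_line()   { printf 'R\303\274cken\n'; }  # UTF-8: "ü" is two bytes, LF ending

# In the C locale, awk's length() counts bytes rather than characters.
byte_length() { LC_ALL=C awk '{ print length($0) }'; }

# Same, but remove a trailing carriage return before counting.
byte_length_no_cr() { LC_ALL=C awk '{ sub(/\r$/, ""); print length($0) }'; }

latin1_line | byte_length
latin1_line | byte_length_no_cr
utf8_line   | byte_length
```

On a system where awk does not translate line endings itself, these three commands should print 7, 6 and 7. Byte counting only equals the character count for single-byte encodings such as ISO-8859-1; for UTF-8 input a UTF-8 locale is still needed.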

Thanks for your quick help!



