Mail Archives: cygwin/2012/07/19/04:50:57

To: cygwin AT cygwin DOT com
From: Ralf <wiesweg AT tacos-gmbh DOT de>
Subject: length in gawk returns wrong value
Date: Thu, 19 Jul 2012 08:50:21 +0000 (UTC)
Lines: 18
Message-ID: <loom.20120719T103849-659@post.gmane.org>

The following lines create a file named ttt.txt. The file contains exactly
what I want (a single byte, octal 374, for the u-umlaut). But the output of
these lines shows that gawk's length() function cannot handle this
character:

uname -a
echo "Rücken" > ttt.txt
od -c ttt.txt
gawk '{print "Length: " length($0)}' ttt.txt

Output:
CYGWIN_NT-6.0-WOW64 WIESWEG 1.7.9(0.237/5/3) 2011-03-29 10:10 i686 Cygwin
0000000   R 374   c   k   e   n  \r  \n
0000010
Length: 1

What can I do to get the correct length in gawk without changing the contents of
ttt.txt?
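
One thing that may be worth trying is a minimal sketch like the one below. It assumes the cause is that gawk runs in a multibyte (UTF-8) locale, where a lone byte octal 374 is not a valid character; that is an assumption, not something confirmed above. Forcing the C locale for the gawk call makes it count bytes, and the file itself is not changed. The printf line only recreates ttt.txt for the test, the AWK variable is a hypothetical helper so the sketch also runs where gawk is not installed, and the sub() call is an optional extra that keeps the trailing carriage return out of the count.

```shell
# Recreate the file from the post: "R", byte octal 374, "cken", CR LF.
printf 'R\374cken\r\n' > ttt.txt

# Prefer gawk; fall back to another awk only so the sketch runs without gawk.
AWK=$(command -v gawk || command -v awk)

# LC_ALL=C (assumed fix) makes awk treat every byte as one character,
# so the lone 0xFC byte is not parsed as a broken multibyte sequence.
# sub() strips the trailing CR so it is not counted.
len=$(LC_ALL=C "$AWK" '{ sub(/\r$/, ""); print "Length: " length($0) }' ttt.txt)
echo "$len"
```

If the rest of the script needs a real locale, setting LC_ALL only on the gawk command line, as here, leaves the environment of the other commands alone.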


