Mail Archives: cygwin/2011/01/26/07:15:39
Message-ID: <1309.192.168.6.58.1296044105.squirrel@simlinux>
Date: Wed, 26 Jan 2011 13:15:05 +0100 (CET)
Subject: Re: Bug in libiconv?
From: simrw AT sim-basis DOT de
To: cygwin AT cygwin DOT com
User-Agent: SquirrelMail/1.4.5
MIME-Version: 1.0

> Here's what happens on Cygwin:
>
> $ gcc -g -o ic ic.c -liconv
> $ ./ic
> iconv: 138 <Invalid or incomplete multibyte or wide character>
> in = <Liian pitkä sana>, inbuf = <ä sana>, inbytesleft = 7, outbytesleft = 492
> iconv: 138 <Invalid or incomplete multibyte or wide character>
> in = <Liian pitkä sana>, inbuf = <ä sana>, inbytesleft = 7, outbytesleft = 492
> iconv: 138 <Invalid or incomplete multibyte or wide character>
> in = <Liian pitkä sana>, inbuf = <ä sana>, inbytesleft = 7, outbytesleft = 492
> in = <Liian pitkä sana>, inbuf = <>, inbytesleft = 0, outbytesleft = 480
>
> So, AFAICS, there are two problems:
>
> - Even though iconv_open has been opened explicitly with "UTF-8" as
> input string, the conversion still depends on the current application
> codeset. That doesn't make sense.
>
> - Even though the last parameter to iconv is defined in bytes, the
> value of outbytesleft after the conversion is the number of remaining
> wchar_t's, not the number of remaining bytes. That's contrary to
> what POSIX defines, see
> http://pubs.opengroup.org/onlinepubs/9699919799/functions/iconv.html

IMHO, the count is correct.
On Windows/Cygwin, wchar_t is 2 bytes; on Linux it is 4 bytes.
So the buffer is 512 bytes.
In the first 3 cases, 10 input bytes were consumed and converted to
10 wchar_t's, so (512 - 20) = 492 bytes remain in the buffer.
In the last case all 16 characters (17 input bytes, since 10 were
consumed and 7 were left in the failing runs) are converted, so
(512 - 32) = 480 bytes remain in the buffer.
Roger