X-Recipient: archive-cygwin@delorie.com
X-Spam-Check-By: sourceware.org
Date: Wed, 26 Jan 2011 14:26:13 +0100
From: Corinna Vinschen <corinna-cygwin@cygwin.com>
To: cygwin@cygwin.com
Subject: Re: Bug in libiconv?
Message-ID: <20110126132613.GN28470@calimero.vinschen.de>
Reply-To: cygwin@cygwin.com
Mail-Followup-To: cygwin@cygwin.com
References: <1309.192.168.6.58.1296044105.squirrel@simlinux>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <1309.192.168.6.58.1296044105.squirrel@simlinux>
User-Agent: Mutt/1.5.21 (2010-09-15)
Mailing-List: contact cygwin-help@cygwin.com; run by ezmlm
Precedence: bulk
List-Id: <cygwin.cygwin.com>
List-Unsubscribe: <mailto:cygwin-unsubscribe-archive-cygwin=delorie.com@cygwin.com>
List-Subscribe: <mailto:cygwin-subscribe@cygwin.com>
List-Archive: <http://sourceware.org/ml/cygwin/>
List-Post: <mailto:cygwin@cygwin.com>
List-Help: <mailto:cygwin-help@cygwin.com>, <http://sourceware.org/ml/#faqs>
Sender: cygwin-owner@cygwin.com
Delivered-To: mailing list cygwin@cygwin.com

On Jan 26 13:15, simrw@sim-basis.de wrote:
> > Here's what happens on Cygwin:
> >
> > $ gcc -g -o ic ic.c -liconv
> > $ ./ic
> > iconv: 138 <Invalid or incomplete multibyte or wide character>
> > in = <Liian pitkä sana>, inbuf = <ä sana>, inbytesleft = 7, outbytesleft = 492
> >   iconv: 138 <Invalid or incomplete multibyte or wide character>
> >   in = <Liian pitkä sana>, inbuf = <ä sana>, inbytesleft = 7, outbytesleft = 492
> >   iconv: 138 <Invalid or incomplete multibyte or wide character>
> >   in = <Liian pitkä sana>, inbuf = <ä sana>, inbytesleft = 7, outbytesleft = 492
> >   in = <Liian pitkä sana>, inbuf = <>, inbytesleft = 0, outbytesleft = 480
> >
> > So, AFAICS, there are two problems:
> >
> >   - Even though iconv_open has been called explicitly with "UTF-8" as
> >     input codeset, the conversion still depends on the current application
> >     codeset.  That doesn't make sense.
> >
> >   - Even though the last parameter to iconv is defined in bytes, the
> >     value of outbytesleft after the conversion is the number of remaining
> >     wchar_t's, not the number of remaining bytes.  That's contrary to
> >     what POSIX defines, see
> >     http://pubs.opengroup.org/onlinepubs/9699919799/functions/iconv.html
> 
> IMHO, the count is correct.
> On Windows/Cygwin, wchar_t is 2 bytes, on Linux, 4 bytes.
> So the buffer is 512 bytes.
> In the first 3 cases, 10 input bytes were consumed and converted to 10
> wchar_t's, so (512 - 20) = 492 bytes remain in the buffer.
> In the last case all 16 characters (17 input bytes) are consumed, so
> (512 - 32) = 480 bytes remain in the buffer.

Yes, you're right.  Quite obviously I misinterpreted the results without
realizing that the buffer is smaller under Cygwin.
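
For the record, the arithmetic can be sketched in C roughly as follows.
This is only an illustration: out_bytes_left is a hypothetical helper, not
part of ic.c or libiconv, and it assumes that every converted character
occupies exactly one wchar_t in the output, which holds for this test
string.

```c
#include <stddef.h>

/* Hypothetical helper, not taken from ic.c or libiconv.
 * Returns the number of bytes left in an output buffer of buf_bytes bytes
 * after `chars` characters have been converted, assuming each character
 * occupies exactly one wchar_t of wchar_size bytes. */
static size_t out_bytes_left(size_t buf_bytes, size_t wchar_size, size_t chars)
{
    return buf_bytes - chars * wchar_size;
}
```

With the 512 byte buffer and the 2 byte wchar_t seen on Cygwin,
out_bytes_left(512, 2, 10) yields 492 for the failing conversions and
out_bytes_left(512, 2, 16) yields 480 for the successful one.  On a
platform with a 4 byte wchar_t both the buffer size and the per-character
cost double, so the byte counts differ even though the same number of
characters was converted.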


Thanks,
Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat

--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple

