Message-ID: <1309.192.168.6.58.1296044105.squirrel@simlinux>
Date: Wed, 26 Jan 2011 13:15:05 +0100 (CET)
Subject: Re: Bug in libiconv?
From: simrw AT sim-basis DOT de
To: cygwin AT cygwin DOT com

> Here's what happens on Cygwin:
>
>   $ gcc -g -o ic ic.c -liconv
>   $ ./ic
>   iconv: 138
>   in = , inbuf = <ä sana>, inbytesleft = 7, outbytesleft = 492
>   iconv: 138
>   in = , inbuf = <ä sana>, inbytesleft = 7, outbytesleft = 492
>   iconv: 138
>   in = , inbuf = <ä sana>, inbytesleft = 7, outbytesleft = 492
>   in = , inbuf = <>, inbytesleft = 0, outbytesleft = 480
>
> So, AFAICS, there are two problems:
>
> - Even though iconv_open has been opened explicitly with "UTF-8" as
>   input string, the conversion still depends on the current application
>   codeset.  That doesn't make sense.
>
> - Even though the last parameter to iconv is defined in bytes, the
>   value of outbytesleft after the conversion is the number of remaining
>   wchar_t's, not the number of remaining bytes.
>   That's contrary to what POSIX defines, see
>   http://pubs.opengroup.org/onlinepubs/9699919799/functions/iconv.html

IMHO, the count is correct. On Windows/Cygwin, wchar_t is 2 bytes; on
Linux, it is 4 bytes. So the buffer is 512 bytes. In the first 3 cases,
10 input bytes were consumed and written as 10 wchar_t's, so
512 - 20 = 492 bytes remain in the buffer. In the last case, all 16
characters are converted, so 512 - 32 = 480 bytes remain.

Roger
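
The byte arithmetic above can be checked with a small sketch. It is not the ic.c from the original report; the helper names (bytes_left_after, utf8_to_wchar) and the 256-element buffer are assumptions made for illustration, and it relies on the iconv implementation accepting "WCHAR_T" as a target encoding, which glibc and GNU libiconv do.

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: bytes still free in an output buffer that holds
   `buf_wchars` wide characters, after `written` wide characters of
   `wchar_size` bytes each have been stored. */
size_t bytes_left_after(size_t buf_wchars, size_t written, size_t wchar_size)
{
    assert(written <= buf_wchars);
    return (buf_wchars - written) * wchar_size;
}

/* Hypothetical helper: convert NUL-terminated UTF-8 text to wchar_t and
   report outbytesleft exactly as iconv() leaves it (a byte count).
   Returns 0 on success, -1 on failure. */
int utf8_to_wchar(const char *in, wchar_t *out, size_t out_bytes,
                  size_t *out_bytes_left)
{
    iconv_t cd = iconv_open("WCHAR_T", "UTF-8");
    if (cd == (iconv_t)-1)
        return -1;

    char *inp = (char *)in;       /* iconv() takes char **, not const char ** */
    size_t in_left = strlen(in);  /* input length in bytes */
    char *outp = (char *)out;
    size_t out_left = out_bytes;  /* output space in bytes, not in wchar_t's */

    size_t rc = iconv(cd, &inp, &in_left, &outp, &out_left);
    iconv_close(cd);

    *out_bytes_left = out_left;
    return rc == (size_t)-1 ? -1 : 0;
}
```

With a 2-byte wchar_t and an assumed 256-element buffer, bytes_left_after(256, 10, 2) gives 492 and bytes_left_after(256, 16, 2) gives 480, matching the output quoted above. On a platform with a 4-byte wchar_t the same conversions would leave different values, because each stored character uses 4 bytes of the buffer.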