Mail Archives: djgpp/2007/06/21/11:30:52

X-Authentication-Warning: delorie.com: mail set sender to djgpp-bounces using -f
From: Simon Tatham <anakin AT pobox DOT com>
Newsgroups: comp.os.msdos.djgpp
Subject: Bug in mbstowcs() in DJGPP's C library
Date: 21 Jun 2007 16:08:00 +0100 (BST)
Organization: Yeah, right
Lines: 74
Message-ID: <ioF*uuQNr@news.chiark.greenend.org.uk>
NNTP-Posting-Host: rapun.sel.cam.ac.uk
X-Newsreader: trn 4.0-test75 (Feb 13, 2001)
Originator: @tunnel.ixion.tartarus.org ([172.31.80.2])
To: djgpp AT delorie DOT com
DJ-Gateway: from newsgroup comp.os.msdos.djgpp
Reply-To: djgpp AT delorie DOT com

[I tried to send this bug report to <djgpp AT delorie DOT com> as
instructed by http://www.delorie.com/djgpp/why.html , but that page
neglected to mention that my email would be rejected if I hadn't
subscribed to the DJGPP mailing list first. I'm posting it here in
preference to subscribing unnecessarily to a mailing list.]

I believe mbstowcs() in DJGPP's C library is behaving incorrectly.
It appears to be unwilling to fill its entire output buffer with
non-zero wide characters, preferring to stop one character short of
the end of the buffer and write a NUL. This sounds sensible, but
it's not what the C standard requires.

Here's a small C program, mbstest.c, which demonstrates the problem:

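#include <stdio.h>
#include <stdlib.h>
int main(void) {
    wchar_t out[20];
    size_t i, ret;   /* mbstowcs() returns size_t, not int */
    /* Limit of 12: exactly the length of the input before its NUL. */
    ret = mbstowcs(out, "hello, world", 12);
    if (ret == (size_t)-1)
        return 1;    /* invalid multibyte sequence */
    for (i = 0; i < ret; i++)
        printf("%4d", (int)out[i]);
    printf("\n");
    return 0;
}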

If I compile this file using DJGPP, with the obvious command line

  gcc -o mbstest.exe mbstest.c

and run it, I get the output

 104 101 108 108 111  44  32 119 111 114 108

indicating that the mbstowcs() call has returned 11, having written
L"hello, worl" followed by a NUL into the 12 wide characters it was
permitted to fill.

However, if I try compiling and running the same program with
Cygwin's C compiler, or Microsoft Visual Studio, or using gcc on
Linux, then all of them produce the output

 104 101 108 108 111  44  32 119 111 114 108 100

indicating that mbstowcs() has written L"hello, world" with no
trailing NUL.

My reading of C99 is that all those other compilers' C libraries are
right and DJGPP's is wrong. C99 simply states that mbstowcs()
converts the input string into wide characters, writes not more than
n of them into the output buffer, and _if_ it sees a NUL byte in the
process then it converts it into a NUL wide character and stops.
There's nothing there to suggest that it should be inventing NULs
where none existed in the input.
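
To make that reading concrete, here is a minimal sketch of the
semantics as I understand them. It is not any real library's code:
sketch_mbstowcs is a hypothetical name, and the sketch assumes every
multibyte character is a single byte (plain ASCII in the "C" locale),
which sidesteps all the real conversion work.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for mbstowcs(); only meaningful when each
 * multibyte character is exactly one byte (an assumption made here
 * purely for illustration). */
size_t sketch_mbstowcs(wchar_t *dst, const char *src, size_t n)
{
    size_t i;

    assert(dst != NULL && src != NULL);
    for (i = 0; i < n; i++) {
        dst[i] = (wchar_t)(unsigned char)src[i];
        if (src[i] == '\0')
            return i;   /* NUL came from the input; it is not counted */
    }
    return n;           /* limit reached first: no NUL is invented */
}
```

Called as sketch_mbstowcs(out, "hello, world", 12), this returns 12,
stores 'd' in out[11] and leaves out[12] untouched, which matches what
the other C libraries do.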

As a result of this, a user just told me, version 1.0 of Halibut
(http://www.chiark.greenend.org.uk/~sgtatham/halibut/) does not run
correctly when compiled with DJGPP. I've committed a change for the
next release which was desirable for other reasons and which works
around this problem, but I do think it's a bug in DJGPP's libc.
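
For anyone else who runs into this, one possible workaround (not
necessarily the change made in Halibut) is to convert into a
temporary buffer one element larger than the limit and copy back at
most n wide characters. The sketch below assumes the behaviour
described above; portable_mbstowcs is a hypothetical name. Note that
it examines one character beyond the limit, so an invalid sequence
immediately past the n-th character is reported as an error where a
conforming mbstowcs() would not have looked at it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical wrapper: fills up to n wide characters of dst and adds
 * a NUL only if the input ended before the limit. Returns (size_t)-1
 * on a conversion error or if the temporary allocation fails. */
size_t portable_mbstowcs(wchar_t *dst, const char *src, size_t n)
{
    wchar_t *tmp;
    size_t ret, copied;

    assert(dst != NULL && src != NULL);
    if (n == 0)
        return 0;
    tmp = malloc((n + 1) * sizeof *tmp);
    if (tmp == NULL)
        return (size_t)-1;

    /* With a limit of n + 1, even a library that reserves the last
     * slot for a NUL still converts n characters. */
    ret = mbstowcs(tmp, src, n + 1);
    if (ret == (size_t)-1) {
        free(tmp);
        return ret;
    }
    copied = ret < n ? ret : n;
    memcpy(dst, tmp, copied * sizeof *tmp);
    if (copied < n)
        dst[copied] = L'\0';   /* input really did end here */
    free(tmp);
    return copied;
}
```

The cost is an allocation per call, so a caller converting many short
strings might prefer a fixed-size scratch buffer instead.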

Also, while I'm here: the Zip Picker doesn't seem to be giving me
correct URLs for ftp.delorie.com. If I go to www.delorie.com/djgpp/,
follow the link to the Zip Picker, and immediately click the `Tell
me which files I need' button, it gives me a bunch of links to
pathnames such as ftp://ftp.delorie.com/pub/djgpp/v2/readme.1st .
But this file gives me a 550 error when I try to retrieve it; the
correct URL appears to be
ftp://ftp.delorie.com/pub/djgpp/current/v2/readme.1st .

-- 
for k in [pow(x,37,0x13AC59F3ECAC3127065A9) for x in [0x195A0BCE1C2F0310B43C,
0x73A0CE584254AB23D5A0, 0x12878657EA814421CC92, 0x7373445BB3DA69996F4A,
0x77A7ED5BC3AA700E80B2, 0xE9C71C94ED87ADCF7367, 0xFE920395F414C1A5DB50]]:
 print "".join([chr(32+3*((k>>x)&1))for x in range(79)]) # <anakin AT pobox DOT com>
