X-Authentication-Warning: delorie.com: mail set sender to djgpp-bounces using -f
From: Simon Tatham
Newsgroups: comp.os.msdos.djgpp
Subject: Bug in mbstowcs() in DJGPP's C library
Date: 21 Jun 2007 16:08:00 +0100 (BST)
Organization: Yeah, right
Lines: 74
Message-ID:
NNTP-Posting-Host: rapun.sel.cam.ac.uk
X-Newsreader: trn 4.0-test75 (Feb 13, 2001)
Originator: @tunnel.ixion.tartarus.org ([172.31.80.2])
To: djgpp AT delorie DOT com
DJ-Gateway: from newsgroup comp.os.msdos.djgpp
Reply-To: djgpp AT delorie DOT com

[I tried to send this bug report by email, as instructed by
http://www.delorie.com/djgpp/why.html , but that page neglected to
mention that my email would be rejected if I hadn't subscribed to the
DJGPP mailing list first. I'm posting it here in preference to
subscribing unnecessarily to a mailing list.]

I believe mbstowcs() in DJGPP's C library is behaving incorrectly. It
appears to be unwilling to fill its entire output buffer with non-zero
wide characters, preferring to stop one character short of the end of
the buffer and write a NUL. This sounds sensible, but it's not what
the C standard requires.

Here's a small C program, mbstest.c, which demonstrates the problem:

  #include <stdio.h>
  #include <stdlib.h>

  int main(void)
  {
      wchar_t out[20];
      int i, ret;

      /* n is 12, exactly the length of the input without its NUL */
      ret = mbstowcs(out, "hello, world", 12);
      for (i = 0; i < ret; i++)
          printf("%4d", out[i]);
      printf("\n");
      return 0;
  }

If I compile this file using DJGPP, with the obvious command line

  gcc -o mbstest.exe mbstest.c

and run it, I get the output

   104 101 108 108 111  44  32 119 111 114 108

indicating that the mbstowcs() call has written L"hello, worl\0" into
the first 12 wide characters of the buffer.

However, if I try compiling and running the same program with Cygwin's
C compiler, or Microsoft Visual Studio, or using gcc on Linux, then
all of them produce the output

   104 101 108 108 111  44  32 119 111 114 108 100

indicating that mbstowcs() has written L"hello, world" with no
trailing NUL.
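The test program above only prints as many values as mbstowcs() claims to have converted, so what ended up in the twelfth slot is inferred rather than observed. A rough sketch of a more direct check follows. The names probe_mbstowcs, struct mbs_probe and PROBE_SENTINEL are invented for this illustration and are not part of any library, and the sketch assumes a locale in which every input byte is a single character.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical marker value; fits in both 16-bit and 32-bit wchar_t. */
#define PROBE_SENTINEL ((wchar_t)0x7FFF)
#define PROBE_BUFLEN 64

/* What one probe observed. */
struct mbs_probe {
    size_t ret;     /* value returned by mbstowcs() */
    wchar_t last;   /* contents of slot n-1, the last one it may write */
    wchar_t after;  /* contents of slot n, which it must not touch */
};

/*
 * Call mbstowcs() with n equal to strlen(src), so that the output
 * limit leaves no room for a terminator, then report what was stored
 * around the limit. Assumes one byte per character and
 * 0 < strlen(src) < PROBE_BUFLEN; otherwise ret is set to (size_t)-1.
 */
struct mbs_probe probe_mbstowcs(const char *src)
{
    wchar_t out[PROBE_BUFLEN];
    struct mbs_probe p;
    size_t n = strlen(src);
    size_t i;

    p.ret = (size_t)-1;
    p.last = p.after = PROBE_SENTINEL;
    if (n == 0 || n >= PROBE_BUFLEN)
        return p;

    /* Pre-fill so that untouched slots are recognisable afterwards. */
    for (i = 0; i < PROBE_BUFLEN; i++)
        out[i] = PROBE_SENTINEL;

    p.ret = mbstowcs(out, src, n);
    p.last = out[n - 1];
    p.after = out[n];
    return p;
}
```

Calling probe_mbstowcs("hello, world") on a library that fills the whole buffer should give ret == 12 and last == 'd' (100). If DJGPP behaves as described above, I would expect ret == 11 and last == 0 there instead. In both cases after should still hold the sentinel.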

My reading of C99 is that all those other compilers' C libraries are
right and DJGPP's is wrong. C99 simply states that mbstowcs() converts
the input string into wide characters, writes not more than n of them
into the output buffer, and _if_ it sees a NUL byte in the process
then it converts it into a NUL wide character and stops. There's
nothing there to suggest that it should be inventing NULs where none
existed in the input.

As a result of this, a user just told me, version 1.0 of Halibut
(http://www.chiark.greenend.org.uk/~sgtatham/halibut/) does not run
correctly when compiled with DJGPP. I've committed a change for the
next release which was desirable for other reasons and which works
around this problem, but I do think it's a bug in DJGPP's libc.

Also, while I'm here: the Zip Picker doesn't seem to be giving me
correct URLs for ftp.delorie.com. If I go to www.delorie.com/djgpp/,
follow the link to the Zip Picker, and immediately click the `Tell me
which files I need' button, it gives me a bunch of links to pathnames
such as ftp://ftp.delorie.com/pub/djgpp/v2/readme.1st . But this file
gives me a 550 error when I try to retrieve it; the correct URL
appears to be ftp://ftp.delorie.com/pub/djgpp/current/v2/readme.1st .

-- 
for k in [pow(x,37,0x13AC59F3ECAC3127065A9) for x in [0x195A0BCE1C2F0310B43C,
0x73A0CE584254AB23D5A0, 0x12878657EA814421CC92, 0x7373445BB3DA69996F4A,
0x77A7ED5BC3AA700E80B2, 0xE9C71C94ED87ADCF7367, 0xFE920395F414C1A5DB50]]:
 print "".join([chr(32+3*((k>>x)&1))for x in range(79)]) #