Mail Archives: cygwin/2009/05/15/05:51:31
On May 15 13:30, Alexey Borzenkov wrote:
> [...]
> It appears that there's a bug in printf with %ls that
> will refuse to print the string completely if the wide string for %ls
> cannot be represented in current charset. It's interesting that
> sometimes it behaves differently. For example:
>
> $ mkpasswd -C
> NDGAMES\aborzenkov:unused:11721:10513:U-NDGAMES\aborzenkov,*sidremoved*:/home/aborzenkov:/bin/bash
> $ mkgroup -C
> NDGAMES\
>
> Notice that in the second case it somehow managed to print domain name
> and separator before failing.
>
> Another example:
>
> #include <stdio.h>
> #include <locale.h>
>
> int main(int argc, char** argv)
> {
>   setlocale(LC_ALL, "en_US.CP1252");
>   printf("'%ls'", L"\u0410\u0411\u0412");
>   return 0;
> }
>
> Prints nothing, i.e. it doesn't print either of the single quotes. If it
> couldn't represent those characters, I think it should either ignore
> them, or try to display them with SO-UTF-8. Making the printf call fail
> like that is, imho, really unexpected.
printf must not decide on its own which charset to use for the widechar
to multibyte conversion. If you run the same code on Linux, you also get
broken output. It only manages to print the leading quote char. It
does not print the second quote char, because the wcrtomb (widechar to
multibyte) conversion failed. If you check the return code of printf,
you see why:
  if (printf("'%ls'xxx", L"\u0410\u0411\u0412") < 0)
    perror ("\nprintf");
prints "printf: Invalid or incomplete multibyte or wide character"
on Linux as well as on Cygwin.
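
The return-code check above can be wrapped in a small helper. This is only
a sketch: the name print_wide_checked is hypothetical and not part of any
library, and the message text produced by strerror differs between systems.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: print a wide string via %ls and report failure.
   Returns 0 on success, -1 if printf reported an error. */
int print_wide_checked(const wchar_t *ws)
{
    errno = 0;
    if (printf("'%ls'\n", ws) < 0) {
        /* EILSEQ typically means a character has no representation
           in the charset of the current locale. */
        fprintf(stderr, "printf failed: %s\n", strerror(errno));
        return -1;
    }
    return 0;
}
```

Called with L"\u0410\u0411\u0412" while the locale is still "C", the
helper is expected to return -1. A plain ASCII string should succeed.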
I'll change mkgroup and mkpasswd to call setlocale and to fall back to
UTF-8 if the locale is "C".
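
A minimal sketch of such a fallback, not the actual mkgroup/mkpasswd
change: the helper name init_locale_utf8_fallback is hypothetical, and the
locale names "C.UTF-8" and "en_US.UTF-8" are assumptions that may not be
available on every system.

```c
#include <locale.h>
#include <string.h>

/* Hypothetical helper: adopt the locale from the environment and fall
   back to a UTF-8 locale when it resolves to plain "C" or "POSIX".
   Returns the name of the LC_CTYPE locale now in effect. */
const char *init_locale_utf8_fallback(void)
{
    const char *loc = setlocale(LC_CTYPE, "");

    if (loc == NULL || strcmp(loc, "C") == 0 || strcmp(loc, "POSIX") == 0) {
        /* Assumed fallback names; try one, then the other. */
        if (setlocale(LC_CTYPE, "C.UTF-8") == NULL)
            setlocale(LC_CTYPE, "en_US.UTF-8");
    }
    /* Querying with NULL reports the current locale without changing it. */
    return setlocale(LC_CTYPE, NULL);
}
```

If neither fallback name exists, the locale simply stays "C" and %ls
keeps failing for non-ASCII characters, so the printf return code still
needs to be checked.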
Corinna
--
Corinna Vinschen Please, send mails regarding Cygwin to
Cygwin Project Co-Leader cygwin AT cygwin DOT com
Red Hat