Mail Archives: cygwin/2009/05/15/05:30:51

Date: Fri, 15 May 2009 13:30:12 +0400
Message-ID: <e2480c70905150230y595b5796wa79c5b34df707fbf@mail.gmail.com>
Subject: Re: [1.7] bug in printf and %ls
From: Alexey Borzenkov <snaury AT gmail DOT com>
To: cygwin AT cygwin DOT com

On Fri, May 15, 2009 at 11:43 AM, Alexey Borzenkov <snaury AT gmail DOT com> wrote:
> I'm in a domain at work and previously used mkpasswd -d and mkgroup -d
> to populate /etc/passwd and /etc/group files. Unfortunately, we mostly
> use Russian versions of Windows (especially on servers) here and most
> built-in user and group names (like Administrator, Domain Users, etc.)
> are localized. With cygwin 1.5 these names were successfully exported
> by mkpasswd/mkgroup, however with cygwin 1.7 all such usernames are
> silently ignored and don't appear in the output.

And I found out why. There appears to be a bug in printf with %ls: if the
wide string passed to %ls cannot be represented in the current charset,
printf refuses to print the string at all. Interestingly, the behaviour
is not always the same. For example:

$ mkpasswd -C
NDGAMES\aborzenkov:unused:11721:10513:U-NDGAMES\aborzenkov,*sidremoved*:/home/aborzenkov:/bin/bash
$ mkgroup -C
NDGAMES\

Notice that in the second case it somehow managed to print the domain
name and the separator before failing.

Another example:

#include <stdio.h>
#include <locale.h>

int main(int argc, char** argv)
{
  setlocale(LC_ALL, "en_US.CP1252");
  printf("'%ls'", L"\u0410\u0411\u0412");
  return 0;
}

This prints nothing, i.e. it prints neither of the single quotes. If
printf cannot represent those characters, I think it should either skip
them or try to display them with SO-UTF-8. Making the whole printf call
fail like that is, imho, really unexpected.
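
Until that is sorted out, a caller could sidestep %ls by converting the
wide string itself. Below is a minimal sketch of that idea, not anything
taken from mkpasswd/mkgroup: the helper name wcs_to_mbs_lossy and the
choice of '?' as the replacement are my own assumptions. It only uses
standard wcrtomb(), which returns (size_t)-1 for a character the current
locale cannot encode, so the failure can be handled per character
instead of losing the whole line.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical helper (not part of Cygwin or newlib): convert a wide
 * string to the current locale's multibyte encoding, writing '?' for
 * every character that wcrtomb() cannot represent.
 * Returns the number of bytes stored in out, not counting the NUL.
 * The result is truncated if it would not fit into cap bytes.
 */
size_t wcs_to_mbs_lossy(char *out, size_t cap, const wchar_t *ws)
{
    mbstate_t st;
    size_t used = 0;

    assert(out != NULL && cap > 0 && ws != NULL);
    memset(&st, 0, sizeof st);

    for (; *ws != L'\0'; ws++) {
        char buf[MB_LEN_MAX];
        size_t n = wcrtomb(buf, *ws, &st);

        if (n == (size_t)-1) {
            /* Encoding error: the conversion state is unspecified
             * afterwards, so reset it and substitute a placeholder. */
            memset(&st, 0, sizeof st);
            buf[0] = '?';
            n = 1;
        }
        if (used + n >= cap)
            break;              /* keep room for the terminating NUL */
        memcpy(out + used, buf, n);
        used += n;
    }
    out[used] = '\0';
    return used;
}
```

With this, the test program above could fill a char buffer via
wcs_to_mbs_lossy(buf, sizeof buf, L"\u0410\u0411\u0412") and print it
with printf("'%s'", buf), so the quotes should come out even when the
Cyrillic letters cannot be encoded. Checking the return value of the
original printf call is also worthwhile: the C standard has printf
return a negative value on an output or encoding error.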
