Mail Archives: cygwin/2009/12/29/06:21:54

In-Reply-To: <380-220091222910461682@cantv.net>
References: <380-220091222910461682 AT cantv DOT net>
Date: Tue, 29 Dec 2009 11:21:41 +0000
Message-ID: <416096c60912290321h5552a067m294925a08c08ce41@mail.gmail.com>
Subject: Re: gcc4[1.7] printf treats differently a string constant and a character array
From: Andy Koppe <andy DOT koppe AT gmail DOT com>
To: Cygwin Tech List <cygwin AT cygwin DOT com>

2009/12/29 Rodrigo Medina:
>>Ah, the problem actually is that your program is missing a call to
>>setlocale(LC_CTYPE, "") to switch to the locale and character set
>>specified in the environment...
>
> That worked! But it means that if one wants to use any locale other
> than C.UTF-8, one has not only to recompile the programs but also to
> modify them. Perhaps the best thing to do is to read the LC_ALL
> variable from the environment and then call setlocale.

setlocale(LC_CTYPE, "") already does that. It checks LC_ALL, LC_CTYPE,
and LANG, in that order, and only if none of them is set does it fall
back to the default locale, C.UTF-8.

'char' string constants with non-ASCII characters are not a good idea
if the program is supposed to work with different charsets, because
they're encoded in one particular charset, namely that of your editor.

In that case you need to use wchar_t strings instead, for example:

#include <locale.h>
#include <wchar.h>

int main(void) {
  setlocale(LC_CTYPE, "");
  wprintf(L"Øl\n");
}

You also need to ensure that the character set gcc assumes for its
input (settable with -finput-charset) matches the one the source file
is saved in; otherwise gcc won't encode the wide string correctly.

Andy
