Mail Archives: djgpp-workers/2001/08/06/13:43:09

From: "Juan Manuel Guerrero" <ST001906 AT HRZ1 DOT HRZ DOT TU-Darmstadt DOT De>
Organization: Darmstadt University of Technology
To: JT Williams <jeffw AT darwin DOT sfbr DOT org>, Eli Zaretskii <eliz AT is DOT elta DOT co DOT il>,
salvador <salvador AT inti DOT gov DOT ar>
Date: Mon, 6 Aug 2001 19:22:00 +0200
Subject: Re: gettext port
CC: djgpp-workers AT delorie DOT com
X-mailer: Pegasus Mail for Windows (v2.54DE)
Message-ID: <378E2A966FC@HRZ1.hrz.tu-darmstadt.de>
Reply-To: djgpp-workers AT delorie DOT com

On Sun, 5 Aug 2001 08:41:26 -0500, JT Williams wrote:
> -: > In the `sed.mo' file for german I see that u" is represented
> -: > by ascii 252 and o" by ascii 246.  But this is not correct for
> -: > either cp437 or cp850 (u" is 129 and o" is 148 in each).
> -: 
> -: See the Content-type header of the file: it probably says that the
> -: file is in ISO-8859-1.  The conversion to cp850 is done on the fly by
> -: libiconv, since on MS-DOS, the default for de locale is cp850.
>
> Using cp850 does not help, because as far as the display of german text is
> concerned, cp437 and cp850 are equivalent.  In fact, given the character
> encoding of `sed.mo', *none* of the six codepages supplied with DOS 5 can
> correctly display the german text from `sed.mo'.
>
> I have read the detailed post by Juan several times, but I still cannot
> determine if the above would indicate that something is broken, or just
> not possible under djgpp.

I have reinspected the sources of libintl.a to recall the details I had forgotten about
this issue. The interesting function is localcharset.c:locale_charset(). It tries to
determine the locale charset by calling nl_langinfo(), if available. If that function is
not available, the locale charset is determined by checking the environment variables
LC_ALL, LC_CTYPE and LANG, in that order. The interesting variable is LANG. It may contain
an alias like es for Spanish or de_CH for German as spoken in Switzerland. This alias is
resolved into a codepage using charset.alias. LANG can also be set directly to a codepage,
so LANG=437 is possible. AFAIK, at least the following ways of specifying a codepage
directly through LANG are allowed:
  LANG=437
  LANG=CP437
  LANG=cp437
All of these settings select codepage 437, and the same applies to all the other
codepages. In this particular case the .mo file will be recoded to codepage 437.
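
The lookup order described above can be sketched roughly as follows. This is only an
illustration in Python, not the code from localcharset.c: the function name
locale_charset_sketch, the one-entry alias table (de -> CP850, taken from the quoted
message) and the way the three spellings are normalized to "CP437" are all assumptions
made for the example.

```python
import os

# Hypothetical stand-in for charset.alias; the real file has many more entries.
# The single entry below comes from the quoted message (de defaults to cp850 on MS-DOS).
CHARSET_ALIAS = {"de": "CP850"}


def locale_charset_sketch(env=None):
    """Return a charset name from LC_ALL, LC_CTYPE and LANG, checked in that order."""
    env = os.environ if env is None else env
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(name, "")
        if not value:
            continue  # unset or empty: fall through to the next variable
        # Direct codepage forms: 437, CP437, cp437 (normalization is an assumption).
        digits = value[2:] if value.upper().startswith("CP") else value
        if digits.isdigit():
            return "CP" + digits
        # Otherwise treat the value as an alias and look it up.
        return CHARSET_ALIAS.get(value)
    return None
```

With this sketch, LANG=437, LANG=CP437 and LANG=cp437 all yield the same codepage, and
a non-empty LC_ALL or LC_CTYPE takes precedence over LANG.
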
To solve your difficulty I would suggest the following lines for your djgpp.env:

LANG=CP437
LANGUAGE=de

The first line selects the appropriate locale charset to be used during runtime recoding.
The second line is evaluated by function dcigettext.c:dcigettext() and is used to
build the path to the .mo file containing the translated strings.
Btw, something like LANGUAGE=de:en does not make much sense. Usually there is no en
(English) subdir in the share/locale tree, because the English strings are _always_ in the
binaries and are used by default if the translations cannot be found.
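
To make the path building concrete, here is a small Python sketch of how a LANGUAGE value
maps to candidate .mo files. It assumes the usual gettext layout
<localedir>/<lang>/LC_MESSAGES/<domain>.mo and a relative localedir of share/locale; the
helper name mo_candidates is made up for this example, and the real dcigettext() does
more than this (it is not a transcription of that function).

```python
import posixpath


def mo_candidates(language, domain, localedir="share/locale"):
    """List the .mo paths implied by a colon-separated LANGUAGE value, in order."""
    return [
        posixpath.join(localedir, lang, "LC_MESSAGES", domain + ".mo")
        for lang in language.split(":")
        if lang  # skip empty entries such as in "de::en"
    ]


print(mo_candidates("de", "sed"))
print(mo_candidates("de:en", "sed"))
```

For LANGUAGE=de:en the second candidate points into an en subdir that usually does not
exist, so it adds nothing over the built-in English strings.
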

I have tested this with the binaries of gtxt039b.zip, recode35b.zip and sed3028b.zip
with either CP850 or CP437 loaded (MSDOS 6.22). This works fine for me.
Nothing is broken in gtxt039, licv17 or sed3028.
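
The byte values from the quoted message can be checked independently of the DJGPP tools.
The following Python sketch assumes the catalogue is stored in ISO-8859-1, as the quoted
reply suggests, and uses Python's own codec tables rather than libiconv; the helper name
recode is chosen just for this example.

```python
def recode(data, source="iso-8859-1", target="cp437"):
    """Recode raw bytes from one charset to another."""
    return data.decode(source).encode(target)


# u" and o" as they appear in the catalogue (ascii 252 and 246).
umlauts = bytes([252, 246])

print(list(recode(umlauts, target="cp437")))  # [129, 148]
print(list(recode(umlauts, target="cp850")))  # [129, 148]
```

Both codepages give 129 and 148, which matches the values expected for cp437 and cp850
in the quoted message.
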

Note that it is possible to set LANG=CP866 to override the setting: ru_RU KOI8-R.

Regards,
Guerrero, Juan M.
