Mail Archives: djgpp/2000/01/17/14:18:36

Message-ID: <F77915E7F086D31197F4009027CC81C91CC4DA@probe-2.as-london.acclaim.com>
From: Shawn Hargreaves <SHargreaves AT acclaimstudios DOT co DOT uk>
To: djgpp AT delorie DOT com
Subject: Re: Allegro, Ansi, TTF2PCX and Umlauts
Date: Mon, 17 Jan 2000 16:46:22 -0000
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2650.21)
Reply-To: djgpp AT delorie DOT com

Manni Heumann writes:
> I guess the problem is, that Allegro uses Ascii codepages (I called 
> set_uformat (U_ASCII)), while the windows fonts are based on an Ansi 
> representation.

I'll assume that you are using a 3.9.x work-in-progress version of
Allegro, which includes Unicode support. If you aren't, it would be
a good idea to upgrade, as 3.12 doesn't have nearly as good
internationalisation support.

(cue short lecture about text encoding formats)

Internally, Allegro uses Unicode format text. This can support all 
the characters needed for any of the major languages (including 
Chinese, Japanese, etc), and uses character values ranging from 
0-65535. See www.unicode.org for tables of what character goes where.

Obviously, there are too many different Unicode characters for you
to store them all in normal char variables. So you have a choice of
many different ways to encode the letters into a string, and can
call set_uformat() to choose which method you would prefer to use.
You could use U_UNICODE, where each character is a 16 bit value,
or U_ASCII, where each character is only 8 bits (so you can only
store letters from 0-255), or the default, U_UTF8, where characters
from 0-127 are stored directly as 8 bit values, and values from 128
to 65535 are encoded as two or more bytes. This method is cool
because it's fairly backward compatible with normal ASCII code, but
easily allows you to support all sorts of different character sets
needed for other parts of the world.
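
To make the UTF-8 layout concrete, here is a minimal sketch of how one
character value in the 0-65535 range could be packed into one, two or
three bytes. The name utf8_encode() is made up for this illustration:
it is not an Allegro function and not how Allegro implements U_UTF8
internally, just plain C following the standard UTF-8 bit layout.

```c
#include <stddef.h>

/* Hypothetical helper, not part of Allegro: encode one character value
 * in the range 0..0xFFFF as UTF-8. Writes 1 to 3 bytes into out and
 * returns the number of bytes written, or 0 if c is out of range.
 */
size_t utf8_encode(unsigned int c, unsigned char *out)
{
    if (c < 0x80) {
        /* 0-127: stored directly, identical to plain ASCII */
        out[0] = (unsigned char)c;
        return 1;
    }
    if (c < 0x800) {
        /* 128-2047: two bytes, 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (c >> 6));
        out[1] = (unsigned char)(0x80 | (c & 0x3F));
        return 2;
    }
    if (c <= 0xFFFF) {
        /* 2048-65535: three bytes, 1110xxxx 10xxxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xE0 | (c >> 12));
        out[1] = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (c & 0x3F));
        return 3;
    }
    return 0; /* values above 0xFFFF are outside this sketch */
}
```

With this, 'A' stays a single byte, while a-umlaut (character 0xE4)
becomes the two bytes 0xC3 0xA4.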

As long as you use only Allegro functions, that's all you need to
know. The text printing functions draw strings from whatever encoding
format you have selected, and the input functions return characters
in the same style. You do need to be careful when manipulating strings
in U_UNICODE or U_UTF8 format, though, as you can't just read 
individual bytes out of a char array when the characters might be
more than one byte wide: you have to use the Allegro functions like
ugetc(), ugetat(), etc, instead.
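
To show why indexing bytes goes wrong, here is a rough sketch of
reading a UTF-8 string one character at a time. utf8_decode() and
utf8_char_at() are hypothetical names invented for this example, in
the spirit of ugetc() and ugetat() but not the Allegro code, and they
assume well-formed input with character values up to 65535.

```c
#include <stddef.h>

/* Hypothetical helper: decode one UTF-8 character (1 to 3 bytes) from s,
 * store its value in *c, and return the number of bytes consumed.
 * Assumes the input is well-formed.
 */
size_t utf8_decode(const unsigned char *s, unsigned int *c)
{
    if (s[0] < 0x80) {
        *c = s[0];
        return 1;
    }
    if ((s[0] & 0xE0) == 0xC0) {
        *c = ((unsigned int)(s[0] & 0x1F) << 6) | (s[1] & 0x3F);
        return 2;
    }
    *c = ((unsigned int)(s[0] & 0x0F) << 12) |
         ((unsigned int)(s[1] & 0x3F) << 6) |
         (s[2] & 0x3F);
    return 3;
}

/* Hypothetical helper: return the character at the given index, counted
 * in characters rather than bytes, or 0 if the index is past the end.
 */
unsigned int utf8_char_at(const char *s, size_t index)
{
    const unsigned char *p = (const unsigned char *)s;
    unsigned int c;

    while (*p) {
        p += utf8_decode(p, &c);
        if (index-- == 0)
            return c;
    }
    return 0;
}
```

For the string "a\xC3\xBCz" (a, u-umlaut, z), utf8_char_at(s, 2) gives
'z', whereas s[2] would be the stray continuation byte 0xBC.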

The problem comes when you want to talk to the outside world, such
as using strings that you typed into your text editor. Here, it really
all depends on what text format your editor is using. At least for
most European countries, Windows and Unix systems will tend to be
using the Latin-1 codepage, which is the same thing as the first 256
characters of Unicode. You could use this text directly with Allegro
in U_ASCII mode, or run it through the textconv program if you want
to convert it into U_UTF8 format. If you are using a DOS editor, 
though, you are in trouble: DOS can use many different character
layouts depending on what country you are in, and Allegro doesn't know
anything about these. You could find a table to convert whatever
format you are using into Unicode, and then use the Allegro U_ASCII_CP
mode to convert all your text using that table, but I really don't
recommend this because it's very inefficient, and also won't work
correctly for other countries that use different DOS codepages.
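
For illustration only, this is roughly what such a conversion table
looks like for the German letters in DOS codepage 437. The struct, the
table and cp437_to_unicode() are names invented for this sketch, the
table is deliberately incomplete, and it is not the table format that
the U_ASCII_CP mode expects.

```c
#include <stddef.h>

/* Hypothetical partial table: a few German letters as stored by DOS
 * codepage 437, with the Unicode value each one corresponds to.
 */
struct cp_entry {
    unsigned char dos;
    unsigned int unicode;
};

static const struct cp_entry cp437_german[] = {
    { 0x81, 0x00FC },   /* u umlaut */
    { 0x84, 0x00E4 },   /* a umlaut */
    { 0x8E, 0x00C4 },   /* A umlaut */
    { 0x94, 0x00F6 },   /* o umlaut */
    { 0x99, 0x00D6 },   /* O umlaut */
    { 0x9A, 0x00DC },   /* U umlaut */
    { 0xE1, 0x00DF }    /* sharp s */
};

/* Map one codepage 437 byte to Unicode. Values below 128 are plain
 * ASCII and pass through unchanged; anything missing from the partial
 * table above comes back as '?'.
 */
unsigned int cp437_to_unicode(unsigned char c)
{
    size_t i;

    if (c < 0x80)
        return c;

    for (i = 0; i < sizeof(cp437_german) / sizeof(cp437_german[0]); i++) {
        if (cp437_german[i].dos == c)
            return cp437_german[i].unicode;
    }
    return '?';
}
```

Note that a-umlaut is byte 0x84 here but 0xE4 in Latin-1, which is why
text typed in a DOS editor shows the wrong glyphs when it is treated
as Latin-1.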

If you need to edit strings that use character values above 127, the 
best method, IMHO, is to get a Unicode-aware editor so you can
create this data directly in UTF-8 format: there are some links on
the Allegro utilities page. Failing that, use a program that edits
Latin-1 format text files, and then use textconv to convert the 
results into UTF-8 format before using them with Allegro. If you
absolutely insist, you could downgrade Allegro by specifying U_ASCII
mode, but then your program will be unable to deal with texts that
use a non-Latin alphabet, so I don't recommend it.
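
As a rough idea of what the Latin-1 to UTF-8 step involves (this is a
sketch, not the textconv source, and latin1_to_utf8() is a name made
up here): because Latin-1 byte values are the same as the first 256
Unicode characters, each byte above 127 simply expands to two bytes.

```c
#include <stddef.h>

/* Hypothetical helper: convert NUL-terminated Latin-1 text to UTF-8.
 * The caller must provide room for 2 * strlen(in) + 1 bytes in out.
 * Returns the number of bytes written, not counting the final NUL.
 */
size_t latin1_to_utf8(const char *in, char *out)
{
    const unsigned char *p = (const unsigned char *)in;
    size_t n = 0;

    for (; *p; p++) {
        if (*p < 0x80) {
            /* plain ASCII is copied unchanged */
            out[n++] = (char)*p;
        } else {
            /* 128-255: the byte value is the Unicode value, two bytes */
            out[n++] = (char)(0xC0 | (*p >> 6));
            out[n++] = (char)(0x80 | (*p & 0x3F));
        }
    }
    out[n] = '\0';
    return n;
}
```

The Latin-1 byte 0xE4 (a-umlaut) comes out as 0xC3 0xA4, and pure
ASCII text passes through untouched.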


	Shawn Hargreaves.
