Mail Archives: djgpp-workers/2001/08/07/04:58:10

Date: Tue, 7 Aug 2001 10:56:46 +0300 (IDT)
From: Eli Zaretskii <eliz AT is DOT elta DOT co DOT il>
X-Sender: eliz AT is
To: salvador <salvador AT inti DOT gov DOT ar>
cc: Juan Manuel Guerrero <ST001906 AT HRZ1 DOT HRZ DOT TU-Darmstadt DOT De>,
djgpp-workers AT delorie DOT com
Subject: Re: gettext port
In-Reply-To: <3B6EDA4E.B1E20D03@inti.gov.ar>
Message-ID: <Pine.SUN.3.91.1010807105615.6564C-100000@is>
MIME-Version: 1.0
Reply-To: djgpp-workers AT delorie DOT com
Errors-To: nobody AT delorie DOT com
X-Mailing-List: djgpp-workers AT delorie DOT com
X-Unsubscribes-To: listserv AT delorie DOT com

On Mon, 6 Aug 2001, salvador wrote:

> What about writing a replacement for libiconv? A simple library that
> loads a table from disk (using some environment variables to select
> the file) and does a simple 1 to 1 conversion (not unicode).

Going through Unicode is better because it requires only 2*N tables
for N supported charsets: one to Unicode and one from Unicode for
each.  What you suggest requires, in general, N^2 tables (in practice
far fewer, but still more than 2*N).
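
To illustrate the counting argument, here is a minimal sketch in Python.
It uses Python's built-in codecs (not libiconv) only because they also
pivot through Unicode; the function names are made up for this example,
and the direct-conversion count assumes one table per ordered pair of
distinct charsets.

```python
def convert(data: bytes, src: str, dst: str) -> bytes:
    """Convert between two charsets by pivoting through Unicode."""
    text = data.decode(src)   # table 1: src charset -> Unicode
    return text.encode(dst)   # table 2: Unicode -> dst charset


def tables_via_unicode(n: int) -> int:
    """One 'to Unicode' and one 'from Unicode' table per charset."""
    return 2 * n


def tables_direct(n: int) -> int:
    """One table per ordered pair of distinct charsets (assumption)."""
    return n * (n - 1)


if __name__ == "__main__":
    # 0x82 is e-acute in code page 437; Latin-1 encodes it as 0xE9.
    print(convert(b"\x82t\x82", "cp437", "latin-1"))
    for n in (3, 10, 50):
        print(n, tables_via_unicode(n), tables_direct(n))
```

With 10 charsets the pivot needs 20 tables, while direct conversion
between every pair would need 90.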

The bloat in libiconv is not because of Unicode; it's because all the
tables are compiled into the code instead of being read from a file.
If someone modifies libiconv to read the tables from files, the bloat
will go away, and you get the bonus of being able to distribute only
those encodings that matter.

> We could provide the most common DOS conversion maps and
> users could write their own.

This is a _really_ Bad Idea, IMHO: reinventing the conversion tables
is a sure way to a mess, because it's very easy to make mistakes,
especially since a large portion of the so-called ``reference
material'' you can find on the Internet includes inaccuracies and
mistakes.

The Unicode charset database is by far the most reliable source of
conversions.  It is constantly tested and updated, so it can be
trusted more than any other resource.  Don't fall into the trap of
thinking it's easy to reinvent what the Unicode people did: it isn't.
