Mail Archives: djgpp-workers/2001/03/15/18:58:45

From: "Juan Manuel Guerrero" <ST001906 AT HRZ1 DOT HRZ DOT TU-Darmstadt DOT De>
Organization: Darmstadt University of Technology
To: recode-bugs AT iro DOT umontreal DOT ca
Date: Thu, 15 Mar 2001 23:30:00 +0200
MIME-Version: 1.0
Subject: OS/DJGPP specific difficulties with recode 3.6
CC: djgpp-workers AT delorie DOT com
X-mailer: Pegasus Mail for Windows (v2.54DE)
Message-ID: <4B62C66334B@HRZ1.hrz.tu-darmstadt.de>
Reply-To: djgpp-workers AT delorie DOT com

While trying to compile recode 3.6 out-of-the-box
on MSDOS using DJGPP I have found two difficulties:

a) An OS specific issue.
The build products (recode.exe and librecode.a) fail the
following tests from the testsuite:

Individual surfaces.
7. ./dumps.m4:3         --- -	Thu Mar 15 15:38:49 2001
FAILED near `dumps.m4:31'
11. ./dumps.m4:92       --- -	Thu Mar 15 15:39:19 2001
FAILED near `dumps.m4:116'
15. ./dumps.m4:174      --- -	Thu Mar 15 15:39:49 2001
FAILED near `dumps.m4:198'
19. ./dumps.m4:256      --- -	Thu Mar 15 15:40:19 2001
FAILED near `dumps.m4:288'
23. ./dumps.m4:353      --- -	Thu Mar 15 15:40:48 2001
FAILED near `dumps.m4:381'
27. ./dumps.m4:442      --- -	Thu Mar 15 15:41:18 2001
FAILED near `dumps.m4:466'
31. ./dumps.m4:522      --- -	Thu Mar 15 15:41:48 2001
FAILED near `dumps.m4:554'
35. ./dumps.m4:619      --- -	Thu Mar 15 15:42:17 2001
FAILED near `dumps.m4:647'
39. ./dumps.m4:708      --- -	Thu Mar 15 15:42:47 2001
FAILED near `dumps.m4:736'
43. ./base64.m4:3       --- -	Thu Mar 15 15:43:16 2001
FAILED near `base64.m4:22'

Individual charsets.

49. ./african.m4:3      FAILED near `african.m4:31'
50. ./african.m4:40     FAILED near `african.m4:62'
51. ./african.m4:71     FAILED near `african.m4:101'
52. ./african.m4:110    FAILED near `african.m4:134'
53. ./african.m4:143    FAILED near `african.m4:162'
56. ./utf7.m4:3         --- -	Thu Mar 15 15:44:10 2001
FAILED near `utf7.m4:20'

Writing `debug-NN.sh' scripts, NN = 7 11 15 19 23 27 31 35 39 43 49 50 51 52 53 56, done

================================================
ERROR: Suite unsuccessful, 16 of 95 tests failed
================================================

I will only show the output that diff produces for the first test:
7. ./dumps.m4:3         --- -	Thu Mar 15 15:38:49 2001
+++ stdout	Thu Mar 15 15:38:48 2001
@@ -1,21 +1,23 @@
- 10
- 97,  10
- 97,  98,  10
- 97,  98,  99,  10
- 97,  98,  99, 100,  10
- 97,  98,  99, 100, 101, 102, 103, 104, 105,  10
+ 13,  10
+ 97,  13,  10
+ 97,  98,  13,  10
+ 97,  98,  99,  13,  10
+ 97,  98,  99, 100,  13,  10
+ 97,  98,  99, 100, 101, 102, 103, 104, 105,  13,  10
  97,  98,  99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111,
-112, 113, 114, 115,  10
+112, 113, 114, 115,  13,  10
  97,  98,  99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111,
-112, 113, 114, 115, 116, 117, 118, 119, 122, 121, 122,  65,  66,  67,  10
+112, 113, 114, 115, 116, 117, 118, 119, 122, 121, 122,  65,  66,  67,  13,
+ 10
  97,  98,  99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111,
 112, 113, 114, 115, 116, 117, 118, 119, 122, 121, 122,  65,  66,  67,  68,
- 69,  70,  71,  72,  73,  74,  75,  76,  77,  10
+ 69,  70,  71,  72,  73,  74,  75,  76,  77,  13,  10
  97,  98,  99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111,
 112, 113, 114, 115, 116, 117, 118, 119, 122, 121, 122,  65,  66,  67,  68,
  69,  70,  71,  72,  73,  74,  75,  76,  77,  78,  79,  80,  81,  82,  83,
- 84,  85,  86,  87,  10
+ 84,  85,  86,  87,  13,  10
  97,  98,  99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111,
 112, 113, 114, 115, 116, 117, 118, 119, 122, 121, 122,  65,  66,  67,  68,
  69,  70,  71,  72,  73,  74,  75,  76,  77,  78,  79,  80,  81,  82,  83,
- 84,  85,  86,  87,  88,  89,  90,  48,  49,  50,  51,  52,  53,  54,  10
+ 84,  85,  86,  87,  88,  89,  90,  48,  49,  50,  51,  52,  53,  54,  13,
+ 10
FAILED near `dumps.m4:31'
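
The only difference in the dump above is the value 13 (carriage return) in
front of each 10 (line feed). As a rough illustration, and not code taken from
recode, the following C sketch classifies the EOL style of a byte buffer. The
names eol_style and classify_eol are hypothetical.

```c
#include <stddef.h>

/* Hypothetical result type: which end-of-line convention a buffer uses. */
enum eol_style { EOL_NONE, EOL_LF, EOL_CRLF, EOL_MIXED };

/* Count line feeds (10) and check whether each one is preceded by a
   carriage return (13).  A buffer with only bare LFs is UNIX-style,
   one with only CR LF pairs is DOS-style.  */
enum eol_style
classify_eol (const unsigned char *buf, size_t len)
{
  size_t lf = 0, crlf = 0, i;

  for (i = 0; i < len; i++)
    if (buf[i] == 10)
      {
        if (i > 0 && buf[i - 1] == 13)
          crlf++;
        else
          lf++;
      }

  if (lf == 0 && crlf == 0)
    return EOL_NONE;
  if (lf == 0)
    return EOL_CRLF;
  if (crlf == 0)
    return EOL_LF;
  return EOL_MIXED;
}
```

Applied to the bytes "97, 10" the function reports EOL_LF; applied to
"97, 13, 10" it reports EOL_CRLF.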

It is *not* worth reproducing diff's output for all the other failing tests.
They do not contain any new information. If you inspect the first few lines
of the output you will notice that the reason for the failure is the different
EOL style used: the reference file uses UNIX-style EOL (LF) while recode
produces DOS-style EOL (CRLF). *All* other tests fail due to the same issue.
The reason for this failure is a chain of changes in the recode 3.6
distribution:
1) The file ./m4/microsoft.m4 has been completely removed. Not only has the
file been removed; the functionality provided by it has also been removed and
has **not** been replaced by some other appropriate code. As the name suggests,
microsoft.m4 supplied the code needed to detect whether DOS/Windows is the OS
in use. The script configure.in contained code that defined the macro
DEFAULT_CHARSET as IBM-PC or latin-1 based on the result returned by microsoft.m4.
Once again, microsoft.m4 and this code in configure.in have been completely
removed ***without*** trying to reproduce this functionality in some other way.
2) This is the relevant snippet from recode 3.5, function disambiguate_name()
from file ./src/names.c:
  /* Look for a match.  */

  if (!name || !*name)
    if (type == SYMBOL_FIND_AS_CHARSET || type == SYMBOL_FIND_AS_EITHER)
      {
	name = getenv ("DEFAULT_CHARSET");
	if (!name)
	  {
#ifdef DEFAULT_CHARSET
	    name = DEFAULT_CHARSET;
	    if (!*name)
#endif
	      return NULL;


Now, the same snippet from recode 3.6, function disambiguate_name()
from file ./src/names.c:
  /* Look for a match.  */

  if (!name || !*name)
    switch (find_type)
      {
      case ALIAS_FIND_AS_CHARSET:
      case ALIAS_FIND_AS_EITHER:
	name = getenv ("DEFAULT_CHARSET");
	if (!name)
	  name = "char"; /* locale dependent */
	break;

      default:
	return NULL;

The important point is the use and function of the macro DEFAULT_CHARSET.
With this macro, an OS specific (and appropriate) charset ****and**** surface
(CRLF or LF) is selected for recode 3.5 at configuration time and later at
run time. This is **no** longer true for recode 3.6.
Once again, when recode 3.5 is started it evaluates the environment
variable DEFAULT_CHARSET to get the appropriate charset. This character set
always implies the surface used. Of course, the average MSDOS/DJGPP user
will never set this variable. Probably he will not even know that it exists.
In this case recode 3.5 will **default** to the content of the macro
DEFAULT_CHARSET, and this is IBM-PC. IBM-PC implies CRLF as surface, and this
selection will DTRT for MSDOS/DJGPP users.
By inspecting the recode 3.6 code it can be seen that it will default to "char",
and this selection always implies LF as surface, making recode 3.6 almost useless
for the majority of users of non-POSIX platforms such as MSDOS/DJGPP.
Of course, the user can change this behaviour by setting
DEFAULT_CHARSET=IBM-PC before invoking recode.exe. In this case **none** of the
tests in the testsuite will fail.
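
To make the two fallback orders explicit, here is a minimal, self-contained C
sketch. It is *not* the recode source; the function names default_charset_35
and default_charset_36 and their parameters are hypothetical and only mirror
the logic of the two snippets quoted above.

```c
#include <stddef.h>

/* Mirrors the recode 3.5 order:
   1. the value of the environment variable DEFAULT_CHARSET, if set;
   2. otherwise the default chosen at configure time, if non-empty;
   3. otherwise NULL (no default available).  */
const char *
default_charset_35 (const char *env_value, const char *configured_default)
{
  if (env_value)
    return env_value;
  if (configured_default && *configured_default)
    return configured_default;
  return NULL;
}

/* Mirrors the recode 3.6 order:
   1. the value of the environment variable DEFAULT_CHARSET, if set;
   2. otherwise always "char", regardless of the platform.  */
const char *
default_charset_36 (const char *env_value)
{
  return env_value ? env_value : "char";
}
```

A caller would pass getenv ("DEFAULT_CHARSET") as env_value. With the variable
unset and a configured default of "IBM-PC", the 3.5 variant yields "IBM-PC"
while the 3.6 variant yields "char".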
It is completely unclear to me why the old (and very well working) code has been
replaced by this ***completely*** POSIX-centric code. This code makes recode
almost useless for non-POSIX users. Even worse, I have inspected the files
README, NEWS and recode.texinfo very carefully and have found **nowhere** a
reference to this new program behaviour. A naive DJGPP user who compiles recode
3.6 out of the box and does **not** run the testsuite (this is probably the
normal case) will get a useless binary and will probably *never* notice it.
Once again, I am not passing judgement on the changes introduced with this
version of recode, but if such drastic changes are made to the sources, they
should at least be documented in the README or NEWS file so that users of
non-POSIX OSes are warned about the new behaviour of the binary. I have never
been involved in recode development, so I will **not** propose any code to
change this.
I do not know how this issue will be handled by Francois Pinard in the future,
so I will not interfere here. For the DJGPP port that I will upload to simtel.net
I will solve this difficulty by brute force.
This means I will simply replace the following code:
	if (!name)
	  name = "char"; /* locale dependent */
	break;

by this one:
	if (!name)
	  name = "IBM-PC";
	break;

This will make recode 3.6 work on DOS/Windows in the same way as recode 3.5 did.
IMHO, this is what the average DJGPP user will expect.
Once again, I am *not* proposing this code change for stock recode sources.


b) A DJGPP specific issue.
To cope with this issue I will send a patch directly to Francois Pinard.
The patch is long and would bore most of the audience on the different
newsgroups. The patch will deal only with files from the contrib/ subdir.
The goal of the patch is:
1) Remove unneeded files from the contrib/ subdir. These are the files
   djgpp-README and djgpp-diffs. Both files were part of the DJGPP
   port of recode 3.4 and are of no use anymore. Experience with the
   DJGPP port of recode 3.5 leads me to the conclusion that these files
   confuse the users. They seem not to know which person (Francois Pinard,
   Wojciech Galazka or Juan Manuel Guerrero) is responsible for what.
   IMHO, nothing will be lost if these files are removed. This avoids having
   duplicate DJGPP specific README and diffs files in the contrib/ subdir.
   Their contents are obsolete anyway. The patch will remove these files.
2) The patch will create the following files:
     contrib/readme.in
     contrib/config.site
     contrib/configdj.bat
     contrib/configdj.sed
     contrib/fnchange.in
     contrib/recodepo.sh

   All these files are needed to configure and compile recode 3.X out-of-the-box.
   Recode now uses libiconv. Because a great number of filenames in the
   libiconv/ subdir do **not** fit into the 8.3 MSDOS namespace, some of
   the filenames must be changed by a MSDOS/DJGPP user who wants to compile the
   original distribution out of the box. For this purpose the file fnchange.in
   is supplied. If djtar.exe is used to untar the original distribution,
   fnchange.in makes it possible to rename the problematic files on the fly.
   This means that files like libiconv/iso8859_1.h will become
   libiconv/iso/8859_1.h etc.
   configdj.bat, configdj.sed and recodepo.sh will modify the Makefile.ins and
   source files to account for the new directory structure of libiconv. recodepo.sh
   will recode the .po files from the UNIX charsets to the appropriate DOS codepages.
3) The patch will modify the file contrib/Makefile.am to account for the DJGPP
   specific changes.
Of course, all these changes will *not* interfere with configuration and
compilation of recode on any other platform.
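
As an illustration of the 8.3 restriction mentioned in item 2, here is a small
C sketch of the length rule only. It is not part of the patch; the helper name
fits_8_3 is hypothetical, and the check deliberately ignores other DOS
restrictions such as forbidden characters or names that collide after
truncation.

```c
#include <stddef.h>
#include <string.h>

/* Return 1 if NAME (a single path component without directory
   separators) satisfies the 8.3 length rule: a base of 1 to 8
   characters, at most one dot, and an extension of at most 3
   characters.  Return 0 otherwise.  */
int
fits_8_3 (const char *name)
{
  const char *dot = strchr (name, '.');
  size_t base_len, ext_len;

  if (dot == NULL)
    {
      base_len = strlen (name);
      return base_len >= 1 && base_len <= 8;
    }

  if (strchr (dot + 1, '.') != NULL)
    return 0;                   /* more than one dot */

  base_len = (size_t) (dot - name);
  ext_len = strlen (dot + 1);
  return base_len >= 1 && base_len <= 8 && ext_len <= 3;
}
```

With this rule iso8859_1.h is rejected because its base has nine characters,
while 8859_1.h is accepted.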

As usual, comments, objections, suggestions, questions are welcome.

Regards,
Guerrero, Juan Manuel
