Mail Archives: cygwin/2004/06/02/10:05:55

Date: Wed, 2 Jun 2004 10:04:34 -0400 (EDT)
From: Igor Pechtchanski <pechtcha AT cs DOT nyu DOT edu>
Reply-To: cygwin AT cygwin DOT com
To: "Gerrit P. Haase" <freeweb AT nyckelpiga DOT de>
cc: "Buchbinder, Barry (NIH/NIAID)" <BBuchbinder AT niaid DOT nih DOT gov>,
cygwin AT cygwin DOT com
Subject: Re: Error message from antiword since upgrade to cygwin 1.5.10
In-Reply-To: <123-1790318362.20040602140951@familiehaase.de>
Message-ID: <Pine.GSO.4.58.0406020949510.18478@slinky.cs.nyu.edu>
References: <F76C9B2DA2FC4C4CA0A18E288BBCBCF70821799F AT nihexchange24 DOT nih DOT gov> <123-1790318362 DOT 20040602140951 AT familiehaase DOT de>

On Wed, 2 Jun 2004, Gerrit P. Haase wrote:

> Barry wrote:
>
> > Syntax error in: '0x00  0x0000  #       NULL'
>
> > It does this with several different MS Word files that (I think) haven't
> > changed since the upgrade.
>
> > But antiword _seems_ to work OK (or at least as well as before the upgrade).
>
> Interesting.  Have the codepage mapping files been modified?
>
> Please have a look in /usr/share/antiword, the three files cp1250.txt,
> cp1251.txt and cp1252.txt should contain "0x00 0x0000  #NULL" at the
> first line of the definitions without space between '#' and 'NULL'.
>
> Gerrit

Gerrit,

The line '0x00 0x0000 # NULL' (with or without whitespace between '#' and
'NULL') appears in most of the mapping files (except roman.txt,
MacRoman.txt, and UTF-8.txt), not just cp125[012].txt -- just "grep NULL
*.txt".  It doesn't seem to make much difference, but the three files you
listed above are in DOS (CRLF) format, roman.txt is in Mac (CR) format,
and the rest are in Unix (LF) format.
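
For anyone who wants to check the line-ending format of each mapping file
themselves, here is a minimal Python sketch. The function names
(line_ending_style, report) are made up for illustration, and the default
directory is simply the /usr/share/antiword path mentioned above; none of
this is part of antiword.

```python
from pathlib import Path


def line_ending_style(data: bytes) -> str:
    """Classify raw file bytes as 'dos', 'mac', 'unix', 'mixed', or 'none'."""
    crlf = data.count(b"\r\n")
    cr = data.count(b"\r") - crlf   # bare CR, not followed by LF (Mac style)
    lf = data.count(b"\n") - crlf   # bare LF, not preceded by CR (Unix style)
    kinds = [name for name, n in (("dos", crlf), ("mac", cr), ("unix", lf)) if n]
    if not kinds:
        return "none"
    return kinds[0] if len(kinds) == 1 else "mixed"


def report(directory: str = "/usr/share/antiword") -> None:
    """Print the detected line-ending style of every .txt file in directory."""
    for path in sorted(Path(directory).glob("*.txt")):
        print(f"{path.name}: {line_ending_style(path.read_bytes())}")
```

Calling report() in a Python session prints one line per mapping file; the
files are read as bytes so that no newline translation hides the difference.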

The error above can be reproduced by running "antiword -m <filename>",
where <filename> is a mapping file.  FWIW, "antiword -m roman.txt"
produces gobs of errors.  IIUC, the '#' should start a comment, so
whitespace differences after '#' shouldn't matter.
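
To illustrate why the whitespace after '#' should be irrelevant, here is a
rough Python sketch of a tolerant line parser. It assumes the format is two
hexadecimal fields followed by an optional '#' comment, which is only what
the quoted line suggests; parse_mapping_line is a hypothetical name and this
is not antiword's actual parser.

```python
def parse_mapping_line(line: str):
    """Return (byte_value, code_point), or None for blank/comment-only lines.

    Assumed format: two hex fields, then an optional '#' comment.
    """
    # Everything from the first '#' onward is a comment; strip() also
    # removes a trailing CR or LF, so DOS, Mac and Unix endings behave alike.
    body = line.split("#", 1)[0].strip()
    if not body:
        return None
    fields = body.split()
    if len(fields) != 2:
        raise ValueError(f"Syntax error in: {line!r}")
    return int(fields[0], 16), int(fields[1], 16)
```

With this approach, '0x00  0x0000  #       NULL' and '0x00 0x0000  #NULL'
parse to the same pair of values, with or without a trailing CR.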

Incidentally, it used to be possible to specify the mapping file name
without the .txt at the end (e.g., "antiword -m cp1251").  It now seems
necessary to add the ".txt" to the filename.
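
The old behaviour could be approximated by a lookup that tries the name as
given and then with ".txt" appended. This is only a sketch of the idea in
Python; resolve_mapping_file is a hypothetical helper, the default directory
is the /usr/share/antiword path from earlier in the thread, and I have not
checked how antiword itself resolves the name.

```python
from pathlib import Path


def resolve_mapping_file(name: str, directory: str = "/usr/share/antiword") -> Path:
    """Find a mapping file by name, accepting it with or without '.txt'."""
    base = Path(directory)
    candidates = [base / name]
    if not name.endswith(".txt"):
        # Fall back to the name with the extension added.
        candidates.append(base / (name + ".txt"))
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    raise FileNotFoundError(f"no mapping file for {name!r} in {directory}")
```

Under these assumptions, both "cp1251" and "cp1251.txt" would resolve to the
same file.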

HTH,
	Igor
-- 
				http://cs.nyu.edu/~pechtcha/
      |\      _,,,---,,_		pechtcha AT cs DOT nyu DOT edu
ZZZzz /,`.-'`'    -.  ;-;;,_		igor AT watson DOT ibm DOT com
     |,4-  ) )-,_. ,\ (  `'-'		Igor Pechtchanski, Ph.D.
    '---''(_/--'  `-'\_) fL	a.k.a JaguaR-R-R-r-r-r-.-.-.  Meow!

"I have since come to realize that being between your mentor and his route
to the bathroom is a major career booster."  -- Patrick Naughton

