Mail Archives: cygwin/2004/06/02/10:05:55
On Wed, 2 Jun 2004, Gerrit P. Haase wrote:
> Barry wrote:
>
> > Syntax error in: '0x00 0x0000 # NULL'
>
> > It does this with several different MS Word files that (I think) haven't
> > changed since the upgrade.
>
> > But antiword _seems_ to work OK (or at least as well as before the upgrade).
>
> Interesting. Have the codepage mapping files been modified?
>
> Please have a look in /usr/share/antiword, the three files cp1250.txt,
> cp1251.txt and cp1252.txt should contain "0x00 0x0000 #NULL" at the
> first line of the definitions without space between '#' and 'NULL'.
>
> Gerrit

Gerrit,
The line '0x00 0x0000 # NULL' (with or without whitespace between '#' and
'NULL') appears in most of the mapping files (all except roman.txt,
MacRoman.txt, and UTF-8.txt), not just cp125[012].txt. Running "grep NULL
*.txt" in /usr/share/antiword shows this. It doesn't seem to make much
difference, but the three files you listed above are in DOS (CRLF) format,
roman.txt is in Mac (CR) format, and the rest are in Unix (LF) format.
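For anyone who wants to check the line-ending formats themselves, here is a
minimal Python sketch. The function name classify_line_endings is made up
for illustration; only the /usr/share/antiword directory comes from this
thread.

```python
from pathlib import Path


def classify_line_endings(data: bytes) -> str:
    """Return 'CRLF', 'CR', 'LF', 'mixed', or 'none' for a byte string."""
    crlf = data.count(b"\r\n")
    cr = data.count(b"\r") - crlf  # bare CR (old Mac style)
    lf = data.count(b"\n") - crlf  # bare LF (Unix style)
    found = [name for name, n in (("CRLF", crlf), ("CR", cr), ("LF", lf)) if n]
    if not found:
        return "none"
    return found[0] if len(found) == 1 else "mixed"


if __name__ == "__main__":
    # Directory taken from the thread; adjust if the files live elsewhere.
    for path in sorted(Path("/usr/share/antiword").glob("*.txt")):
        print(f"{path.name}: {classify_line_endings(path.read_bytes())}")
```

Reading the files as bytes matters here: opening them in text mode would
translate the line endings before they could be counted.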
The error above can be reproduced by running "antiword -m <filename>",
where <filename> is a mapping file. FWIW, "antiword -m roman.txt"
produces gobs of errors. IIUC, the '#' should start a comment, so
whitespace differences after '#' shouldn't matter.
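To illustrate what I would expect from a parser, here is a rough Python
sketch that treats '#' as the start of a comment and tolerates a trailing
CR. This is NOT antiword's actual code. The two-hex-field layout is assumed
from the quoted line, and parse_mapping_line is a hypothetical name.

```python
from typing import Optional, Tuple


def parse_mapping_line(line: str) -> Optional[Tuple[int, int]]:
    """Parse '0xNN 0xNNNN # comment' into (code, unicode), or None if empty."""
    # Drop everything from the first '#' on, so '#NULL' and '# NULL'
    # are treated identically. strip() also removes a trailing CR or LF.
    body = line.split("#", 1)[0].strip()
    if not body:
        return None  # blank or comment-only line
    fields = body.split()
    if len(fields) != 2:
        raise ValueError(f"Syntax error in: {line.rstrip()!r}")
    # int(..., 16) accepts the '0x' prefix.
    return int(fields[0], 16), int(fields[1], 16)
```

With a parser along these lines, neither the spacing after '#' nor the
DOS/Unix line endings would affect the result.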
Incidentally, it used to be possible to specify the mapping file name
without the .txt at the end (e.g., "antiword -m cp1251"). It now seems
necessary to add the ".txt" to the filename.
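The old behavior could be approximated by a lookup like the Python sketch
below. It is only a guess at the idea, not how antiword resolves the name,
and resolve_mapping_file is a hypothetical helper.

```python
from pathlib import Path
from typing import Optional


def resolve_mapping_file(name: str, directory: Path) -> Optional[Path]:
    """Try the name as given, then with '.txt' appended."""
    for candidate in (directory / name, directory / (name + ".txt")):
        if candidate.is_file():
            return candidate
    return None  # neither form exists
```

With this, both "cp1251" and "cp1251.txt" would resolve to the same file.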
HTH,
Igor
--
http://cs.nyu.edu/~pechtcha/
pechtcha AT cs DOT nyu DOT edu
igor AT watson DOT ibm DOT com
Igor Pechtchanski, Ph.D.
a.k.a JaguaR-R-R-r-r-r-.-.-. Meow!
"I have since come to realize that being between your mentor and his route
to the bathroom is a major career booster." -- Patrick Naughton