Message-ID: | <6910a60810020449yc8a5c3fxae4c278944ef3b32@mail.gmail.com> |
Date: | Thu, 2 Oct 2008 13:49:14 +0200 |
From: | "Reini Urban" <rurban AT x-ray DOT at> |
To: | "=?ISO-8859-1?Q?Bernt_R=F8skar_Brenna?=" <bernt DOT brenna AT gmail DOT com> |
Subject: | Re: Missing file from cygwin's catdoc |
Cc: | "The Cygwin Mailing List" <cygwin AT cygwin DOT com> |
In-Reply-To: | <6910a60810011236k7c451cc3y97d2df61687bbd00@mail.gmail.com> |
MIME-Version: | 1.0 |
References: | <f5ff2d960810010354m2ddd50b2hd7bbcf8eacb22dab AT mail DOT gmail DOT com> <6910a60810011236k7c451cc3y97d2df61687bbd00 AT mail DOT gmail DOT com> |
2008/10/1 Reini Urban:
> 2008/10/1 Bernt Røskar Brenna:
>> I believe that the Cygwin package catdoc is missing the file /etc/catdocrc
>>
>> Without the settings in the file, running catdoc against Word
>> documents with Norwegian characters produces strange results.
>>
>> I have used the following /etc/catdocrc:
>> charset_path=/usr/share/catdoc
>> map_path=/usr/share/catdoc
>> source_charset=cp1252
>> target_charset=8859-1
>> unknown_char='?'
>
> Thanks for this info.
>
>> There is another quite strange matter:
>> 'catdoc test_catdoc.doc' and 'catdoc -d8859-1 test_catdoc.doc'
>> produces different results (with the config file above, that has
>> 8859-1 as default). How is that possible?
>
> Because the defaults are insane.
> in: cp1251 out: koi8-r
>
> ----- version 0.94.2-2 -----
> * Added --with-input=cp1252 --with-output=8859-1
>   Was cp1251 to koi8-r as default
> * Added /etc/catdocrc (thanks to Bernt Røskar Brenna)
>
>> $ catdoc test_catdoc.doc
>> aeoa
>>
>> $ catdoc -d8859-1 test_catdoc.doc
>> æøå

I was wrong before. The source charset is almost always unicode
and the target charset is falsely detected on cygwin as US-ASCII.
You get "aeoa" because "æøå" translated to US-ASCII is "aeoa".
You can override this with use_locale=no in the catdocrc.

> I'm just having problems with the new cygport or autoreconfig,
> so it doesn't build yet.
> I hope I can fix it soon.

I also found the build problem and fixed it.
The new release will come this evening.
-- 
Reini Urban
http://phpwiki.org/
http://murbreak.at/
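
To illustrate the explanation above, here is a minimal Python sketch of why the same text can come out as "aeoa" or "æøå" depending on the target charset. It is not catdoc's code: the names ASCII_FALLBACK, to_ascii and to_latin1 are hypothetical, and the three-entry replacement table is an assumption chosen only to reproduce the output reported in the thread. catdoc uses its own charset and replacement map files.

```python
# Hypothetical replacement table for illustration only; catdoc reads its
# real substitutions from its own map files, not from this dictionary.
ASCII_FALLBACK = {"æ": "ae", "ø": "o", "å": "a"}


def to_ascii(text, unknown_char="?"):
    """Render text for a US-ASCII target, substituting non-ASCII characters."""
    out = []
    for ch in text:
        if ord(ch) < 128:
            # Plain ASCII passes through unchanged.
            out.append(ch)
        else:
            # Use the replacement if one is known, otherwise the placeholder.
            out.append(ASCII_FALLBACK.get(ch, unknown_char))
    return "".join(out)


def to_latin1(text):
    """Render text for an ISO-8859-1 target; æ, ø and å exist there as-is."""
    return text.encode("iso-8859-1", errors="replace")


if __name__ == "__main__":
    sample = "æøå"
    print(to_ascii(sample))   # target wrongly detected as US-ASCII
    print(to_latin1(sample))  # target forced to ISO-8859-1, as with -d8859-1
```

With a US-ASCII target each Norwegian letter has to be replaced, while an ISO-8859-1 target can represent all three directly. That matches the difference between running catdoc with the locale-detected charset and running it with -d8859-1 (or with use_locale=no in the catdocrc, as suggested above).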