Mail Archives: cygwin/2013/09/25/15:50:40

Message-ID: <52433E7A.4070600@xs4all.nl>
Date: Wed, 25 Sep 2013 21:50:18 +0200
From: Erwin Waterlander <waterlan AT xs4all DOT nl>
User-Agent: Mozilla/5.0 (Windows NT 6.0; rv:17.0) Gecko/20130801 Thunderbird/17.0.8
MIME-Version: 1.0
To: cygwin AT cygwin DOT com
Subject: Re: How does Cygwin handle non-Latin1 man pages? (move to UTF-8?)
References: <5241EF7D DOT 9080309 AT xs4all DOT nl>
In-Reply-To: <5241EF7D.9080309@xs4all.nl>
X-IsSubscribed: yes

Erwin Waterlander wrote, on 24-9-2013 22:01:
> Hi,
>
> As far as I see it, on Cygwin it is assumed that man pages are encoded 
> in Latin-1 (ISO-8859-1).
> For instance the man pages of vim.
>
> /usr/share/man/fr/vim.1.gz is encoded in Latin-1.
>
> $ export LANG=fr_FR.UTF-8
> $ man vim
>
> This will show the French man page correctly. Latin-1 is converted to 
> UTF-8.
>
> For the Russian translation of the vim manual I see two files:
> /usr/share/man/ru.UTF-8/man1/vim.1.gz
> /usr/share/man/ru.KOI8-R/man1/vim.1.gz
>
>
> When I type
> $ export LANG=ru_RU.UTF-8
> $ man vim
>
> I get the English man page, instead of the Russian man page.
> I think because there is no /usr/share/man/ru/man1/vim.1.gz present.
>

The problem here is that man looks for the manual in these directories, 
in this order:
/usr/share/man/ru_RU.UTF-8
/usr/share/man/ru_RU
/usr/share/man/ru

None of these three paths exists on Cygwin.
I could set LANG to ru.UTF-8, but that is not common practice. Normally 
you set LANG to ru_RU.UTF-8. Therefore I think that the non-Latin1 
directories under /usr/share/man have the wrong name.
When I set LANG to ru.UTF-8, man finds the Russian man page but 
displays it wrongly, even when I fix the NROFF line in /etc/man.conf.
Moving /usr/share/man/ru.UTF-8 to /usr/share/man/ru_RU.UTF-8 (and fixing 
man.conf) makes the man page display properly. This confirms that the 
non-Latin1 directories have the wrong name in Cygwin.
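
To make the lookup order concrete, here is a minimal sketch in POSIX shell. 
It only mimics the three-step fallback described above; it is not man's 
actual code, and the function name man_locale_dirs is my own invention.

```shell
# Hypothetical helper: print the directories man would try for a locale,
# most specific first. This imitates the order described in this mail,
# it is not taken from man's source.
man_locale_dirs() {
    locale=$1
    base=${2:-/usr/share/man}
    lang_territory=${locale%%.*}       # ru_RU.UTF-8 -> ru_RU
    lang=${lang_territory%%_*}         # ru_RU       -> ru

    printf '%s\n' "$base/$locale"
    if [ "$lang_territory" != "$locale" ]; then
        printf '%s\n' "$base/$lang_territory"
    fi
    if [ "$lang" != "$lang_territory" ]; then
        printf '%s\n' "$base/$lang"
    fi
}

# Example: list the candidates for the usual Russian locale.
man_locale_dirs ru_RU.UTF-8
```

With LANG=ru.UTF-8 the same sketch yields /usr/share/man/ru.UTF-8 first, 
which is consistent with man finding the Russian page only in that case.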

> When I type
>
> $ export LANG=ru_RU.UTF-8
> $ export LANGUAGE=ru.UTF-8
> $ man vim
>
> The Russian man page is displayed, but all Russian characters are 
> wrongly displayed.
> I think because it is assumed the man page is in Latin-1.
>
> To get a correct display of the Russian man page I need to change 
> /etc/man.config
> I change the line with NROFF to:
> NROFF         /usr/bin/preconv | /usr/bin/nroff -c -mandoc 2>/dev/null
>
> Now the Russian man page displays correctly, but now all the Latin-1 
> pages display wrongly.

This can be fixed by adding a coding tag, which preconv understands, to 
the first or second line of the man page.
When I set LANG to fr_FR.UTF-8, move /usr/share/man/fr.UTF-8 to 
/usr/share/man/fr_FR.UTF-8, and add this tag to vim.1:

.\" -*- coding: latin-1; -*-

the French manual displays properly.
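
Adding the tag by hand to many pages is tedious, so here is a rough 
sketch of how it could be scripted. The function name add_coding_tag is 
hypothetical, and it works on an uncompressed page, so a .gz page has to 
be unpacked first.

```shell
# Hypothetical helper: prepend a preconv coding tag to an uncompressed
# man page, unless one of its first two lines already carries a tag.
add_coding_tag() {
    page=$1
    encoding=${2:-latin-1}

    # Leave the page alone if it is already tagged.
    if head -n 2 "$page" | grep -q 'coding:'; then
        return 0
    fi

    tmp=$(mktemp) || return 1
    # Emits: .\" -*- coding: latin-1; -*-
    { printf '.\\" -*- coding: %s; -*-\n' "$encoding"; cat "$page"; } > "$tmp" || return 1
    # Write back in place so the page keeps its owner and permissions.
    cat "$tmp" > "$page" && rm -f "$tmp"
}
```

Usage would be something like: add_coding_tag vim.1 latin-1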


>
> So I undo my change in /etc/man.conf
>
>
> On Linux the trend is to convert all man pages to UTF-8 encoding.
> Will Cygwin follow this trend?
>
>

The following needs to be done in Cygwin to have man pages for all 
scripts displayed properly out of the box (assuming a UTF-8 locale and 
use of mintty):

* Rename the non-Latin1 directories under /usr/share/man/ to 
fr_FR.UTF-8, ru_RU.UTF-8, and so on.
* Change /etc/man.conf to use preconv:
NROFF         /usr/bin/preconv | /usr/bin/nroff -c -mandoc 2>/dev/null
* Convert all Latin-1 encoded man pages to UTF-8, or add a latin-1 coding 
tag on the first line.
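
For the last point, converting a page could look roughly like the sketch 
below. It assumes gzip and iconv are installed, and the function name 
latin1_page_to_utf8 is only illustrative. It must only be run on pages 
that really are Latin-1, because a page that is already UTF-8 would be 
double-encoded.

```shell
# Hypothetical helper: recode one gzipped Latin-1 man page to UTF-8.
# $1 = source .gz page (Latin-1), $2 = destination .gz page (UTF-8).
# Assumes the source really is ISO-8859-1; there is no detection here.
latin1_page_to_utf8() {
    src=$1
    dst=$2
    gzip -dc "$src" | iconv -f ISO-8859-1 -t UTF-8 | gzip -c > "$dst"
}
```

For example: latin1_page_to_utf8 vim.1.gz vim.utf8.1.gz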

regards,

-- 
Erwin Waterlander
http://waterlan.home.xs4all.nl/

