Message-ID: <524341E5.6080601@xs4all.nl>
Date: Wed, 25 Sep 2013 22:04:53 +0200
From: Erwin Waterlander
To: cygwin AT cygwin DOT com
Subject: Re: How does Cygwin handle non-Latin1 man pages? (move to UTF-8?)
References: <5241EF7D DOT 9080309 AT xs4all DOT nl> <52433E7A DOT 4070600 AT xs4all DOT nl>
In-Reply-To: <52433E7A.4070600@xs4all.nl>

Erwin Waterlander wrote on 25-9-2013 21:50:
>
> Erwin Waterlander wrote on 24-9-2013 22:01:
>> Hi,
>>
>> As far as I see it, on Cygwin it is assumed that man pages are
>> encoded in Latin-1 (ISO-8859-1).
>> For instance the man pages of vim.
>>
>> /usr/share/man/fr/vim.1.gz is encoded in Latin-1.
>>
>> $ export LANG=fr_FR.UTF-8
>> $ man vim
>>
>> This will show the French man page correctly. Latin-1 is converted to
>> UTF-8.
>>
>> For the Russian translation of the vim manual I see two files:
>> /usr/share/man/ru.UTF-8/man1/vim.1.gz
>> /usr/share/man/ru.KOI8-R/man1/vim.1.gz
>>
>> When I type
>> $ export LANG=ru_RU.UTF-8
>> $ man vim
>>
>> I get the English man page, instead of the Russian man page.
>> I think because there is no /usr/share/man/ru/man1/vim.1.gz present.
>>
>
> The problem here is that man looks for the manual in these directories
> in this order:
> /usr/share/man/ru_RU.UTF-8
> /usr/share/man/ru_RU
> /usr/share/man/ru
>
> None of these three paths is present on Cygwin.
> I could set LANG to ru.UTF-8, but this is not common practice.
> Normally you set LANG to ru_RU.UTF-8. Therefore I think that the
> non-Latin1 folders under /usr/share/man have the wrong name.
> When I set LANG to ru.UTF-8, man finds the Russian man page, but
> displays it wrongly, even when I fix the NROFF line in /etc/man.conf.
> Moving /usr/share/man/ru.UTF-8 to /usr/share/man/ru_RU.UTF-8 (and
> fixing man.conf) makes the man page display properly. This confirms
> that the non-latin1 directories have the wrong name in Cygwin.
>
>> When I type
>>
>> $ export LANG=ru_RU.UTF-8
>> $ export LANGUAGE=ru.UTF-8
>> $ man vim
>>
>> The Russian man page is displayed, but all Russian characters are
>> wrongly displayed.
>> I think because it is assumed the man page is in Latin-1.
>>
>> To get a correct display of the Russian man page I need to change
>> /etc/man.conf
>> I change the line with NROFF to:
>> NROFF /usr/bin/preconv | /usr/bin/nroff -c -mandoc 2>/dev/null
>>
>> Now the Russian man page displays correctly, but now all the Latin-1
>> pages display wrongly.
>
> This can be fixed by adding a coding tag to the first or second line
> of the man page, which is understood by preconv.
> When I set LANG to fr_FR.UTF-8, move /usr/share/man/fr.UTF-8 to
> /usr/share/man/fr_FR.UTF-8, and add this tag to vim.1
>
> .\" -*- coding: latin-1; -*-
>
> The French manual displays properly.

Actually this is not working. Somewhere the coding tag is lost, although
preconv seems to do a good job. I reported this three years ago to the
maintainers of man and groff, but it appears it is still not fixed.

>
>>
>> So I undo my change in /etc/man.conf
>>
>>
>> On Linux the trend is to convert all man pages to UTF-8 encoding.
>> Will Cygwin follow this trend?
>>
>
> The following needs to be done in Cygwin to have man pages for all
> scripts displayed properly out of the box (assuming a UTF-8 locale
> and use of mintty):
>
> * Rename the non-latin1 directories under /usr/share/man/ to
>   fr_FR.UTF-8, ru_RU.UTF-8, and so on.
> * Change /etc/man.conf to use preconv:
>   NROFF /usr/bin/preconv | /usr/bin/nroff -c -mandoc 2>/dev/null
> * Convert all Latin-1 coded man pages to UTF-8, or add a latin-1
>   coding tag on the first line

Since the coding tag is not working it is best to convert all man pages
to UTF-8.

-- 
Erwin Waterlander
http://waterlan.home.xs4all.nl/
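
For illustration, the conversion step (Latin-1 man pages to UTF-8) could
look roughly like the sketch below. It assumes gzip, iconv and mktemp are
installed; convert_page is a hypothetical helper name, not something
shipped with man or groff. It is a sketch, not a tested migration script.

```shell
# Hypothetical helper (not part of man or groff): rewrite one gzipped
# Latin-1 man page as UTF-8, keeping the same file name.
# Usage: convert_page /path/to/page.1.gz
convert_page() {
    src=$1
    # Refuse to touch files that are not valid gzip archives.
    gzip -t "$src" 2>/dev/null || return 1
    tmp=$(mktemp) || return 1
    # Decompress and convert ISO-8859-1 -> UTF-8 into a temporary file.
    if gzip -dc "$src" | iconv -f ISO-8859-1 -t UTF-8 > "$tmp"; then
        # Recompress over the original only after the conversion worked.
        gzip -c "$tmp" > "$src"
        status=$?
    else
        status=1
    fi
    rm -f "$tmp"
    return $status
}
```

It would be called once per page, for example in a loop over the *.gz
files of one Latin-1 directory (which directories and sections apply is
an assumption that has to be checked per installation). Every byte
sequence is valid Latin-1, so iconv cannot notice a page that is already
UTF-8; running the helper twice on the same file garbles it silently.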