Mail Archives: cygwin/2010/04/19/07:08:52

To: cygwin AT cygwin DOT com
Subject: Re: Cygwin 1.7 man: '-' char in option/switch in man page is not displayed/not encoded well if LANG=C.UTF-8
References: <28287625 DOT post AT talk DOT nabble DOT com>
Date: Mon, 19 Apr 2010 13:08:38 +0300
MIME-Version: 1.0
From: "Matthias Andree" <matthias DOT andree AT gmx DOT de>
Message-ID: <op.vbeo8oeb1e62zd@balu.cs.uni-paderborn.de>
In-Reply-To: <28287625.post@talk.nabble.com>
User-Agent: Opera Mail/10.51 (Win32)

LiuYan 刘研 wrote on 2010-04-19:

>
> After moved from Cygwin 1.5 to Cygwin 1.7, the '-' char in switch/option in
> man page is not displayed.
>
> As Cygwin 1.7 have revised to 1.7.5 and cygwin-doc-1.7 is released and this
> problem still exists, so I decide to figure it out.
>
> I have a previous post "Cygwin 1.7: Empty/white-space output when display
> Chinese characters in GBK charset encoding?" here
> http://old.nabble.com/Cygwin-1.7%3A-Empty-white-space-output-when-display-Chinese-characters-in-GBK-charset-encoding--ts26774467.html,
> It seems this problem is similar to that one, because if i set LANG=C.GBK,
> '-' char will displayed well. But Cygwin 1.7 select UTF-8 as default
> encoding, is it 'man' does not follow this default encoding?
>
> please see the screenshot from old.nabble.com:
> http://old.nabble.com/file/p28287625/cygwin-1.7-LANG%253DC.UTF-8-man.png


Short story: groff 1.20.1 seems to fix this.


I didn't follow all of this, but the gist is that the groff macros used
U+2212 (MINUS SIGN, −) or U+2010 (HYPHEN, ‐) for "\-", and sometimes for
"-", when they could have used U+002D (HYPHEN-MINUS, -) for compatibility.
This was discussed on the groff list about three years ago; see the thread
starting at http://www.mail-archive.com/groff AT gnu DOT org/msg03657.html for
reference.
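
To make the difference between these characters concrete, here is a small Python sketch (not part of groff or man; the helper names `describe` and `find_option` are made up for illustration). It prints the code point, Unicode name, and UTF-8 length of each character, and shows why typing an ASCII "-o" does not match text rendered with U+2212:

```python
import unicodedata

# The three dash-like characters discussed in this message.
CANDIDATES = ["\u2212", "\u2010", "\u002d"]


def describe(ch):
    """Return (code point, Unicode name, UTF-8 byte length) for one character."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), len(ch.encode("utf-8")))


def find_option(text, option="-o"):
    """Plain substring search with an ASCII option, as typed on a keyboard."""
    return option in text


if __name__ == "__main__":
    for ch in CANDIDATES:
        print(describe(ch))

    # Hypothetical rendered lines: one using MINUS SIGN, one using HYPHEN-MINUS.
    print(find_option("\u2212o file"))  # False: the typed "-" is a different code point
    print(find_option("-o file"))       # True
```

A terminal or font that cannot show U+2212 or U+2010 would leave a gap where the option dash should be, which would match the symptom reported above; that link is my reading, not something the sketch proves.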

There is also a related groff commit:

2009-01-03  Werner LEMBERG  <address omitted>

	* tmac/an-old.tmac, tmac/doc.tmac: For -Tutf8, map \-, -, ', and `
	conservatively to ASCII for the sake of easy cut and paste.

While it is meant for cut and paste, it would incidentally also fix  
searching.
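
The effect of that mapping can be imitated outside groff. The Python sketch below is only an illustration: the real change lives in the groff macro files, and the helper name `to_ascii_punctuation` is hypothetical. It also assumes that the non-ASCII renderings of ' and ` are U+2019 and U+2018, which the ChangeLog entry does not spell out.

```python
# Map typographic characters back to the ASCII characters the man page
# author typed. The quote code points are an assumption (see above).
ASCII_MAP = str.maketrans({
    "\u2212": "-",  # MINUS SIGN -> HYPHEN-MINUS
    "\u2010": "-",  # HYPHEN -> HYPHEN-MINUS
    "\u2019": "'",  # RIGHT SINGLE QUOTATION MARK -> APOSTROPHE
    "\u2018": "`",  # LEFT SINGLE QUOTATION MARK -> GRAVE ACCENT
})


def to_ascii_punctuation(text):
    """Return text with the characters above replaced by ASCII equivalents."""
    return text.translate(ASCII_MAP)


if __name__ == "__main__":
    rendered = "\u2212\u2212help and \u2018ls \u2010l\u2019"
    print(to_ascii_punctuation(rendered))
```

After such a mapping, an option copied from the page can be pasted into a shell, and a search for "--help" finds the text.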

Relevant changes are:

http://cvs.savannah.gnu.org/viewvc/groff/tmac/an-old.tmac?root=groff&r1=1.61&r2=1.62&sortby=date
http://cvs.savannah.gnu.org/viewvc/groff/tmac/doc.tmac?root=groff&r1=1.38&r2=1.39&sortby=date

It appears that this is fixed in groff 1.20.1 (I just tried it on Cygwin
1.7.5(0.225/5/3) with mintty 0.6.1 on Windows 7 Pro German), so updating
the groff package might fix this for good.

-- 
Matthias Andree
