Mail Archives: cygwin/2018/09/04/02:07:00

Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm
List-Id: <cygwin.cygwin.com>
List-Subscribe: <mailto:cygwin-subscribe AT cygwin DOT com>
List-Archive: <http://sourceware.org/ml/cygwin/>
List-Post: <mailto:cygwin AT cygwin DOT com>
List-Help: <mailto:cygwin-help AT cygwin DOT com>, <http://sourceware.org/ml/#faqs>
Sender: cygwin-owner AT cygwin DOT com
Mail-Followup-To: cygwin AT cygwin DOT com
Delivered-To: mailing list cygwin AT cygwin DOT com
Reply-To: Brian DOT Inglis AT SystematicSw DOT ab DOT ca
Subject: Re: Cygwin fails to utilize Unicode replacement character
To: cygwin AT cygwin DOT com
References: <20180903210258 DOT GC6350 AT calimero DOT vinschen DOT de> <5b8db27e DOT 1c69fb81 DOT e3b47 DOT 6cd8 AT mx DOT google DOT com>
From: Brian Inglis <Brian DOT Inglis AT SystematicSw DOT ab DOT ca>
Openpgp: preference=signencrypt
Message-ID: <5251efa5-e7a0-883b-f9e9-f76606e9ee52@SystematicSw.ab.ca>
Date: Tue, 4 Sep 2018 00:06:39 -0600

On 2018-09-03 16:15, Steven Penny wrote:
> On Mon, 3 Sep 2018 23:02:58, Corinna Vinschen wrote:
>> I can't. I only have a limited set of fonts available in the console.

Install the dejavu-fonts package, or just the DejaVu Sans Mono font, from:

	https://dejavu-fonts.github.io/Download.html
	http://sourceforge.net/projects/dejavu/files/dejavu/2.37/dejavu-fonts-ttf-2.37.tar.bz2

Or check which glyph is at index 0 (.notdef).

> http://superuser.com/questions/390933/add-font-cmd-window-choices/956818

For Windows support, in Explorer I just search for all *.[ot]tf files under
...CygRoot.../usr/share/fonts/ and copy them into /Windows/Fonts/.
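
The same search-and-copy step could be scripted. The following is only a minimal sketch: the function name collect_fonts and both directory arguments are placeholders, not anything Cygwin or Windows provides.

```python
import shutil
from pathlib import Path


def collect_fonts(font_root, dest_dir):
    """Copy every .otf/.ttf file found under font_root into dest_dir."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(Path(font_root).rglob("*")):
        # Match the *.[ot]tf pattern, ignoring case in the extension.
        if path.is_file() and path.suffix.lower() in {".otf", ".ttf"}:
            shutil.copy2(path, dest / path.name)
            copied.append(path.name)
    return copied
```

Copying with Explorer into the Fonts folder also installs the fonts, while a plain file copy from a script may not register them. It is probably safer to point dest_dir at a staging folder and install from there.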

>> What I just did was calling the GetFontUnicodeRanges function
>> for each font, and it turns out that none of the fonts support
>> 0xfffd "REPLACEMENT CHARACTER", but all three support 0xfffc
>> "OBJECT REPLACEMENT CHARACTER".  I expanded the testcase to check
>> for this with GetGlyphIndicesW and, lo and behold, the result
>> makes sense.

>> On the other hand, during testing I saw a 0xfffd character printed for
>> these fonts.  None of them actually supports 0xfffd, so apparently the
>> Windows console already uses replacement fonts if possible.
>> I guess I just stop here and always print 0xfffd.  I seriously doubt
>> it makes sense to add so much code just to print a single char in a
>> border case.
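
For reference, GetFontUnicodeRanges fills a GLYPHSET whose WCRANGE entries each hold a first code point (wcLow) and a count (cGlyphs). The coverage check described above then reduces to a range lookup. This is a rough sketch of that lookup only, not the actual testcase, and the sample ranges are made up to mirror the result reported above.

```python
def covers(ranges, code_point):
    """Return True if code_point falls in any (first, count) range."""
    return any(first <= code_point < first + count for first, count in ranges)


# Hypothetical coverage data, invented for illustration; real values
# would come from the WCRANGE array of the selected font.
sample_ranges = [(0x0020, 0x5F), (0x00A0, 0x60), (0xFFFC, 1)]

for cp in (0xFFFC, 0xFFFD):
    status = "supported" if covers(sample_ranges, cp) else "missing"
    print(f"U+{cp:04X}: {status}")
```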

> this is not possible; most likely you were seeing the ".notdef glyph":
> http://docs.microsoft.com/typography/opentype/spec/recom
> for Consolas which is similar in appearance to U+FFFD REPLACEMENT CHARACTER. The
> difference is that if you copy the ".notdef glyph" and paste it into "Notepad" or
> similar, it will paste the proper character that couldn't be seen in the console,
> while pasting U+FFFD into "Notepad" will just paste itself.
> Expanding on the "Notepad" example, the "Notepad" default font is "Lucida Console",
> which doesn't have U+FFFD either. However pasting into "Notepad" will still show
> U+FFFD properly because "Tahoma" has U+FFFD and "Notepad" can utilize composite
> fonts, while it appears "cmd.exe" and similar cannot.

You can use Windows font linking to use glyphs from linked fonts like:

. GNU Unifont, showing bitmap glyphs for BMP code points - release 8.0.1 is
available in the Cygwin package unifont-fonts - the latest release (below) is 11.0.2

	https://savannah.gnu.org/projects/unifont
	http://unifoundry.com/unifont/index.html

. Evertype Last Resort font provided by Apple showing standard representative
Unicode block glyphs with the code point in the wide glyph border

	https://www.unicode.org/policies/lastresortfont_eula.html

. SIL Fallback showing the BMP code point inside a box

	https://scripts.sil.org/cms/scripts/page.php?site_id=nrsi&id=UnicodeBMPFallbackFont
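
Font linking is configured in the registry under HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\FontLink\SystemLink, where each value is named after a base font and holds a REG_MULTI_SZ list of "file name,face name" entries. A minimal sketch of setting one follows. The helper names are invented for illustration, the font file name in the usage note is an assumption to check against the installed file, and whether the console honours the link may depend on the Windows version.

```python
SYSTEM_LINK_KEY = r"SOFTWARE\Microsoft\Windows NT\CurrentVersion\FontLink\SystemLink"


def link_entries(fonts):
    """Format (file name, face name) pairs as SystemLink entries."""
    return [f"{file_name},{face_name}" for file_name, face_name in fonts]


def set_font_links(base_face, fonts):
    """Replace the linked-font list for base_face (Windows only, needs admin)."""
    import winreg  # imported here so link_entries works on any platform

    with winreg.CreateKeyEx(
        winreg.HKEY_LOCAL_MACHINE, SYSTEM_LINK_KEY, 0, winreg.KEY_SET_VALUE
    ) as key:
        # This overwrites any existing list for base_face; read and merge
        # the current value first if other links must be preserved.
        winreg.SetValueEx(key, base_face, 0, winreg.REG_MULTI_SZ, link_entries(fonts))
```

For example, set_font_links("Consolas", [("unifont.ttf", "Unifont")]) would link Unifont behind Consolas, assuming the file is installed under that name. Signing out or restarting is usually needed before the change takes effect.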

-- 
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada

This email may be disturbing to some readers as it contains
too much technical detail. Reader discretion is advised.

