From: Brian Inglis
Reply-To: Brian DOT Inglis AT SystematicSw DOT ab DOT ca
To: cygwin AT cygwin DOT com
Subject: Re: Cygwin fails to utilize Unicode replacement character
Date: Tue, 4 Sep 2018 00:06:39 -0600
Message-ID: <5251efa5-e7a0-883b-f9e9-f76606e9ee52@SystematicSw.ab.ca>
In-Reply-To: <5b8db27e.1c69fb81.e3b47.6cd8@mx.google.com>
References: <20180903210258 DOT GC6350 AT calimero DOT vinschen DOT de> <5b8db27e DOT 1c69fb81 DOT e3b47 DOT 6cd8 AT mx DOT google DOT com>

On 2018-09-03 16:15, Steven Penny wrote:
> On Mon, 3 Sep 2018 23:02:58, Corinna Vinschen wrote:
>> I can't. I only have a limited set of fonts available in the console.

Install dejavu-fonts package or just DejaVu Sans Mono font from:
https://dejavu-fonts.github.io/Download.html
http://sourceforge.net/projects/dejavu/files/dejavu/2.37/dejavu-fonts-ttf-2.37.tar.bz2
or see what glyph is at index 0 (.notdef)?

> http://superuser.com/questions/390933/add-font-cmd-window-choices/956818

For Windows support, from Explorer I just search all *.[ot]tf under
...CygRoot.../usr/share/fonts/ and copy into /Windows/Fonts/

>> What I just did was calling the GetFontUnicodeRanges function
>> for each font, and it turns out that none of the fonts support
>> 0xfffd "REPLACEMENT CHARACTER", but all three support 0xfffc
>> "OBJECT REPLACEMENT CHARACTER".  I expanded the testcase to check
>> for this with GetGlyphIndicesW and, lo and behold, the result
>> makes sense.
>> On the other hand, during testing I saw a 0xfffd character printed for
>> these fonts.  None of them actually supports 0xfffd, so apparently the
>> Windows console already uses replacement fonts if possible.
>> I guess I just stop here and always print 0xfffd.  I seriously doubt
>> it makes sense to add so much code just to print a single char in a
>> border case.

> this is not possible; most likely you were seeing the ".notdef glyph":
> http://docs.microsoft.com/typography/opentype/spec/recom
> for Consolas which is similar in appearance to U+FFFD REPLACEMENT CHARACTER. The
> difference is that if you copy the ".notdef glyph" and paste it into "Notepad" or
> similar, it will paste the proper character that couldn't be seen in the console,
> while pasting U+FFFD into "Notepad" will just paste itself.
>
> Expanding on the "Notepad" example, "Notepad" default font is "Lucida Console",
> which doesn't have U+FFFD either. However pasting into "Notepad" will still show
> U+FFFD properly because "Tahoma" has U+FFFD and "Notepad" can utilize composite
> font, while it appears "cmd.exe" and similar cannot.

You can use Windows font linking to use glyphs from linked fonts like:

. GNU Unifont showing bitmap glyphs for BMP code points - release 8.0.1 is
  available in Cygwin package unifont-fonts - latest below is 11.0.2
  https://savannah.gnu.org/projects/unifont
  http://unifoundry.com/unifont/index.html
. Evertype Last Resort font provided by Apple showing standard representative
  Unicode block glyphs with the code point in the wide glyph border
  https://www.unicode.org/policies/lastresortfont_eula.html
. SIL Fallback showing the BMP code point inside a box
  https://scripts.sil.org/cms/scripts/page.php?site_id=nrsi&id=UnicodeBMPFallbackFont

-- 
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada

This email may be disturbing to some readers as it contains
too much technical detail. Reader discretion is advised.
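
PS: The coverage check quoted above (a font reporting U+FFFC but not U+FFFD)
can be modelled roughly as below. This is only a sketch in Python, not the
actual Cygwin testcase: it assumes coverage arrives as (first code point,
count) pairs shaped like the WCRANGE entries in the GLYPHSET that
GetFontUnicodeRanges fills in. The helper name `covers` and the range data
are hypothetical, made up purely for illustration.

```python
# Sketch only: models a font coverage lookup over (low, count) ranges,
# shaped like WCRANGE entries (wcLow, cGlyphs). Not real font data.

REPLACEMENT_CHARACTER = 0xFFFD         # U+FFFD
OBJECT_REPLACEMENT_CHARACTER = 0xFFFC  # U+FFFC


def covers(ranges, code_point):
    """Return True if code_point falls inside any (low, count) range."""
    return any(low <= code_point < low + count for low, count in ranges)


# Hypothetical coverage: printable ASCII plus a single entry at U+FFFC.
hypothetical_ranges = [(0x0020, 95), (0xFFFC, 1)]

for cp in (REPLACEMENT_CHARACTER, OBJECT_REPLACEMENT_CHARACTER):
    status = "covered" if covers(hypothetical_ranges, cp) else "missing"
    print(f"U+{cp:04X}: {status}")
```

Note that a "missing" result here says nothing about what the console ends
up drawing: as discussed above, the font's .notdef glyph or a linked font
may still be displayed for a code point the selected font does not cover.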