Mail Archives: cygwin/2017/12/12/14:42:53

Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm
List-Id: <cygwin.cygwin.com>
List-Subscribe: <mailto:cygwin-subscribe AT cygwin DOT com>
List-Archive: <http://sourceware.org/ml/cygwin/>
List-Post: <mailto:cygwin AT cygwin DOT com>
List-Help: <mailto:cygwin-help AT cygwin DOT com>, <http://sourceware.org/ml/#faqs>
Sender: cygwin-owner AT cygwin DOT com
Mail-Followup-To: cygwin AT cygwin DOT com
Delivered-To: mailing list cygwin AT cygwin DOT com
Subject: Re: Need help with multibyte UTF-8 characters
References: <626a3c06-e9f2-1932-f1f3-47ddb2051215 AT gmail DOT com> <9d3b73ff-f596-51a2-909a-30a767e3e9b3 AT gmail DOT com>
From: Thomas Taylor <tayloth AT gmail DOT com>
To: cygwin AT cygwin DOT com
Message-ID: <1909177a-3f35-52d5-1717-9007d6efaa71@gmail.com>
Date: Tue, 12 Dec 2017 14:42:38 -0500
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.5.0
MIME-Version: 1.0
In-Reply-To: <9d3b73ff-f596-51a2-909a-30a767e3e9b3@gmail.com>

--------------F45A94644063078C0C6E8549
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit

I believe that Cygwin displays certain UTF-8 characters incorrectly.  To
see the problem, first save the attached "utf-8_test.sed" text file to
your desktop.  Then run "mintty" and set its options by right-clicking
in its title bar, selecting "Options" and then "Text".  On the Text page,
set "Locale" to "en_US" and "Character set" to "UTF-8", and then click
"Save".  Now exit and restart mintty.  Change directory to your desktop
and open the utf-8_test.sed file in the editor "vim".  Once inside vim,
do a ":set fileencoding=utf-8".  You should now see that vim correctly
displays most of the sample one-, two-, and three-byte UTF-8 character
encodings in the test file.  Vim fails, however, on the three-byte
encodings for the en dash, the em dash, and the ellipsis, each of which
displays incorrectly as a filled-in rectangle.  Now exit vim and run
"less" or "cat" on the utf-8_test.sed file.  You should again see most
of the sample UTF-8 encoded characters displayed correctly, except for
the en dash, em dash, and ellipsis.  Since all three programs show the
same behavior, it looks like a problem in the underlying Cygwin run-time
libraries rather than in vim, less, or cat.  I haven't tested this on
four-byte UTF-8 character encodings, but I assume Cygwin will have
similar problems.
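
As a cross-check that is not part of the original test, the expected
byte sequences for the affected characters can be printed and compared
against the percent-encoded patterns in the attached sed file (for
example %E2%80%93 for the en dash).  The sketch below is only an
illustration: the names SAMPLES, utf8_hex, and describe are made up
here, and the choice of Python is an assumption rather than anything
used in the report.  If the bytes match, the file itself is well-formed
UTF-8 and the fault lies somewhere in the display path.

```python
# Hypothetical diagnostic helper, not taken from the original report.
# It prints the UTF-8 byte sequence of each sample character so the
# values can be compared with the %XX patterns in utf-8_test.sed.

SAMPLES = {
    "en dash": "\u2013",
    "em dash": "\u2014",
    "horizontal ellipsis": "\u2026",
    "euro sign": "\u20ac",
    "a with acute": "\u00e1",
}


def utf8_hex(ch: str) -> str:
    """Return the UTF-8 encoding of ch as space-separated hex bytes."""
    return " ".join(f"{b:02X}" for b in ch.encode("utf-8"))


def describe(samples=SAMPLES):
    """Return one 'U+XXXX name: bytes' line per sample character."""
    return [
        f"U+{ord(ch):04X} {name}: {utf8_hex(ch)}"
        for name, ch in samples.items()
    ]


if __name__ == "__main__":
    for line in describe():
        print(line)
```

The three characters that display as rectangles encode to E2 80 93,
E2 80 94, and E2 80 A6, the same three-byte shape as the euro sign
(E2 82 AC), which is reported to display correctly.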


--------------F45A94644063078C0C6E8549
Content-Type: text/plain; charset=UTF-8;
 name="utf-8_test.sed"
Content-Transfer-Encoding: base64
Content-Disposition: attachment;
 filename="utf-8_test.sed"

IyBUaGlzIGlzIGZpbGUgInV0Zi04X3Rlc3Quc2VkIgojCiMgSXQncyB1c2Vk
IGJ5IHRoZSAic2VkIiB1dGlsaXR5IHByb2dyYW0KIyB0byBjb252ZXJ0IFhN
TC1lbmNvZGVkIGZpbGVuYW1lcyB0byBVVEYtOAoKIyBNYXRjaCBsb25nZXN0
IHN0cmluZ3MgZmlyc3QKCiMgVGhyZWUtYnl0ZSBlbmNvZGluZ3M6CgojIEVu
IGRhc2gKcy8lW0VlXTIlODAlOTMv4oCTL2cKCiMgRW0gZGFzaApzLyVbRWVd
MiU4MCU5NC/igJQvZwoKIyBIb3Jpem9udGFsIGVsbGlwc2lzCnMvJVtFZV0y
JTgwJVtBYV02L+KApi9nCgojIExlc3MtdGhhbi1vci1lcXVhbCBzaWduCnMv
JVtFZV0yJTg5JVtBYV00L+KJpC9nCgojIEV1cm8gc3ltYm9sCnMvJVtFZV0y
JTgyJVtBYV1bQ2NdL+KCrC9nCgojIFR3by1ieXRlIGVuY29kaW5nczoKCiMg
Tm9uLWJyZWFrIHNwYWNlCnMvJVtDY10yJVtBYV0wL+KOtS9nCgojIExvd2Vy
Y2FzZSBhIHdpdGggYWN1dGUgYWNjZW50CnMvJVtDY10zJVtBYV0xL8OhL2cK
CiMgTG93ZXJjYXNlIGEgd2l0aCB1bWxhdXQgKGEuay5hLiBkaWFlcmVzaXMp
CnMvJVtDY10zJVtBYV00L8OkL2cKCiMgTG93ZXJjYXNlIGUgd2l0aCBhY3V0
ZSBhY2NlbnQKcy8lW0NjXTMlW0FhXTkvw6kvZwoKIyBMb3dlcmNhc2UgaSB3
aXRoIGFjdXRlIGFjY2VudApzLyVbQ2NdMyVbQWFdRC/DrS9nCgojIExvd2Vy
Y2FzZSBvIHdpdGggYWN1dGUgYWNjZW50CnMvJVtDY10zJVtCYl0zL8OzL2cK
CiMgTG93ZXJjYXNlIG4gd2l0aCB0aWxkZQpzLyVbQ2NdMyVbQmJdMS/DsS9n
CgojIExvd2VyY2FzZSBjIHdpdGggYWN1dGUgYWNjZW50IApzLyVbQ2NdNCU4
Ny/Ehy9nCgojIExvd2VyY2FzZSBvIHdpdGggbG9uZyBhY2NlbnQgKGEuay5h
LiBtYWNyb24pCnMvJVtDY101JThbRGRdL8WNL2cKCiMgT25lLWJ5dGUgZW5j
b2RpbmdzOgoKIyAiQW5kIiBzaWduIChhLmsuYS4gYW1wZXJzYW5kKQpzLyYj
Mzg7L1wmL2cKCiMgU3BhY2UKcy8lMjAvIC9nCgojIFNoYXJwIChvciBwb3Vu
ZCkgc2lnbgpzLyUyMy8jL2cKCiMgUGVyY2VudCBzaWduCnMvJTI1LyUvZwoK
IyBMZWZ0IHNxdWFyZSBicmFja2V0CnMvJTVbQmJdL1svZwoKIyBSaWdodCBz
cXVhcmUgYnJhY2tldApzLyU1W0RkXS9dL2cKCiMgRW5kIG9mIGZpbGUgInV0
Zi04X3Rlc3Quc2VkIgoK


--------------F45A94644063078C0C6E8549
Content-Type: text/plain; charset=us-ascii


--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
--------------F45A94644063078C0C6E8549--