Mailing-List: | contact cygwin-help AT cygwin DOT com; run by ezmlm |
List-Id: | <cygwin.cygwin.com> |
List-Subscribe: | <mailto:cygwin-subscribe AT cygwin DOT com> |
List-Archive: | <http://sourceware.org/ml/cygwin/> |
List-Post: | <mailto:cygwin AT cygwin DOT com> |
List-Help: | <mailto:cygwin-help AT cygwin DOT com>, <http://sourceware.org/ml/#faqs> |
Sender: | cygwin-owner AT cygwin DOT com |
Mail-Followup-To: | cygwin AT cygwin DOT com |
Delivered-To: | mailing list cygwin AT cygwin DOT com |
Subject: | Re: Issues with width of emoji |
To: | cygwin AT cygwin DOT com |
References: | <CAEpCGDR1DDu1rPUge1H_LoGDP0gG4fd6S+BDKZ=f9oni6=eU8Q AT mail DOT gmail DOT com> |
From: | Thomas Wolff <towo AT towo DOT net> |
Message-ID: | <70b0e3e9-d961-983c-0c95-5a0ace8a99e6@towo.net> |
Date: | Fri, 21 Sep 2018 19:43:13 +0200 |
User-Agent: | Mozilla/5.0 (Windows NT 10.0; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.9.1 |
MIME-Version: | 1.0 |
In-Reply-To: | <CAEpCGDR1DDu1rPUge1H_LoGDP0gG4fd6S+BDKZ=f9oni6=eU8Q@mail.gmail.com> |
X-IsSubscribed: | yes |
On 21.09.2018 at 03:42, Pokechu22 wrote:
> Hello! For a while I've had issues with emoji and cygwin, but due to
> some recent configuration changes on my end it's gotten to the point
> where it's actively causing problems.

Some of your problem descriptions remain a bit obscure, e.g. which recent changes have caused which problems...

> My specific case involves running weechat on my raspberry pi, which I
> connect to with `mosh pi -- screen -D -RR weechat
> /usr/local/bin/weechat-curses`. When someone used an emoji on IRC,
> the entire screen would get messed up in some cases, as things got
> misaligned (an example of this: https://i.imgur.com/V7D6jPc.png).
> Previously I had a script that converted emoji into their escapes,

What are "their escapes"? Emojis are encoded in Unicode directly, so they do not need any escapes.

> but that recently started misbehaving; even with that script there were
> other unicode characters such as the mathematical alphanumeric symbols
> characters (<https://en.wikipedia.org/wiki/Mathematical_Alphanumeric_Symbols>)

Unicode does not define any emojis in the range Mathematical Alphanumeric Symbols (U+1D400-U+1D7FF).

> that caused the issue too; I'm still going to refer to these as emoji
> because I most commonly have this problem with emoji and I don't have
> a good name otherwise.
>
> I initially assumed that this was a problem with mosh on the pi, what
> with the pi being an ARM device. However, after later investigation,
> it turns out that it's a cygwin problem. Some different cases where
> things behave weirdly:
>
> * Typing an emoji and then pressing backspace twice ends up deleting
>   the emoji and the character before visually, but the character before
>   isn't actually deleted (e.g. echo hi<emoji> then backspace twice still
>   prints hi)

See your own conclusion below.
> * Running mosh, even as a loopback (`mosh --local ::1`), shows 2
>   characters when the emoji is typed
> * Emoji behave incorrectly when pasted into nano
> * curses apps (which include mosh and nano) write a 2-wide space for
>   emoji, as can be seen in this script
>   <https://gist.github.com/Pokechu22/45d19aa5e41ee6db00723f808ac4339e>.
>   This is only 1 character wide on my pi.

This may be related to different Unicode versions. The width of many emojis changed from 1 to 2 in Unicode 9 (I think).

> * There are no problems when using SSH, at least to my pi, interestingly.

So please describe how you connect in each case where the same test cases behave differently.

> * Python refuses to create a ctypes.c_wchar containing an emoji, but
>   considers the len of a string with a single emoji to be 1. On my pi
>   it creates a c_wchar properly.
>
> I think that most of the desyncs and other weird things I've been
> getting are a result of different systems disagreeing about how wide
> the character should be;

Yes, and of different applications disagreeing as well. Do you actually run the cygwin terminal or the cygwin console for your test cases?

> that makes the most sense at least.
> Alternatively, it might be an issue with the character being
> represented as multiple characters; as far as I can tell there are
> only problems with characters outside of the basic multilingual plane
> (i.e. value >= 0x10000).

Yes, as UTF-16 may be involved, which represents each non-BMP character as a pair of 16-bit "surrogate" code units. It might be helpful to repeat all observations with other, non-emoji, non-BMP characters, in order to isolate the effects.

> One last thing I noticed: in ncurses, there seems to be some special
> stuff to implement wcwidth and wcswidth, including a comment in
> ncurses/widechar/widechars.c that says "MinGW has wide-character
> functions, but they do not work correctly." As far as I can tell,
> this is not enabled on cygwin; I'm not sure if it should be enabled or not.
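To tell the two explanations discussed above apart (width disagreement versus surrogate pairs), a small check along the following lines might help. This is only a sketch: the helper names `utf16_units` and `describe` and the three sample characters are my own choices for illustration, and the East Asian Width property reported by Python's `unicodedata` module is just a hint at what a given `wcwidth` implementation may return, since the result depends on the Unicode version each side was built with.

```python
import unicodedata


def utf16_units(ch):
    """Return the list of 16-bit UTF-16 code units that encode one character."""
    data = ch.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little") for i in range(0, len(data), 2)]


def describe(ch):
    """Collect code point, UTF-16 code units and East Asian Width for one character."""
    return {
        "codepoint": "U+%04X" % ord(ch),
        "utf16": ["%04X" % unit for unit in utf16_units(ch)],
        # "W"/"F" usually mean 2 columns; other values usually mean 1 column.
        "east_asian_width": unicodedata.east_asian_width(ch),
    }


# Hypothetical sample set: one BMP character, one emoji, and one
# non-emoji character outside the BMP (Mathematical Bold Capital A).
SAMPLES = {
    "BMP letter": "a",
    "emoji, non-BMP": "\U0001F600",
    "math bold A, non-BMP, not an emoji": "\U0001D400",
}

if __name__ == "__main__":
    for label, ch in SAMPLES.items():
        print(label, describe(ch))
```

If the non-emoji, non-BMP sample misbehaves in the same way as the emoji, the surrogate pair representation is the more likely culprit; if only the emoji misbehaves, a width disagreement between the systems involved seems more plausible.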
> I hope I explained this well enough; it's a somewhat complicated issue
> and I don't know all of the relevant unicode vocabulary.

--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple