Subject: Re: Issues with width of emoji
From: Thomas Wolff
To: cygwin AT cygwin DOT com
Date: Fri, 21 Sep 2018 19:43:13 +0200
Message-ID: <70b0e3e9-d961-983c-0c95-5a0ace8a99e6@towo.net>

On 21.09.2018 at 03:42, Pokechu22 wrote:
> Hello!
> For a while I've had issues with emoji and cygwin, but due to
> some recent configuration changes on my end it's gotten to the point
> where it's actively causing problems.

Some of your problem descriptions remain a bit obscure, e.g. which recent changes caused which problems...

> My specific case involves running weechat on my raspberry pi, which I
> connect to with `mosh pi -- screen -D -RR weechat
> /usr/local/bin/weechat-curses`. When someone used an emoji on IRC,
> the entire screen would get messed up in some cases, as things got
> misaligned (an example of this: https://i.imgur.com/V7D6jPc.png).
> Previously I had a script that converted emoji into their escapes,

What are "their escapes"? Emojis are encoded directly in Unicode, so they do not need any escapes.

> but that recently started misbehaving; even with that script there were
> other unicode characters such as the mathematical alphanumeric symbols
> characters ()

Unicode does not define any emojis in the range Mathematical Alphanumeric Symbols (U+1D400-U+1D7FF).

> that caused the issue too; I'm still going to refer to these as emoji
> because I most commonly have this problem with emoji and I don't have
> a good name otherwise.
>
> I initially assumed that this was a problem with mosh on the pi, what
> with the pi being an ARM device. However, after later investigation,
> it turns out that it's a cygwin problem. Some different cases where
> things behave weirdly:
>
> * Typing an emoji and then pressing backspace twice ends up deleting
> the emoji and the character before visually, but the character before
> isn't actually deleted (e.g. echo hi then backspace twice still
> prints hi)

See your own conclusion below.

> * Running mosh, even as a loopback (`mosh --local ::1`), shows 2
> characters when the emoji is typed
> * Emoji behave incorrectly when pasted into nano
> * curses apps (which include mosh and nano) write a 2-wide space for
> emoji, as can be seen in this script
> .
> This is only 1 character wide on my pi.

This may be related to different Unicode versions. The width of many emojis changed from 1 to 2 in Unicode 9 (I think).

> * There are no problems when using SSH, at least to my pi, interestingly.

So please describe how you connect in each case where the same test behaves differently.

> * Python refuses to create a ctypes.c_wchar containing an emoji, but
> considers the len of a string with a single emoji to be 1. On my pi
> it creates a c_wchar properly.
>
> I think that most of the desyncs and other weird things I've been
> getting are a result of different systems disagreeing about how wide
> the character should be;

Yes, and of different applications. Do you actually run the cygwin terminal or the cygwin console for your test cases?

> that makes the most sense at least.
> Alternatively, it might be an issue with the character being
> represented as multiple characters; as far as I can tell there are
> only problems with characters outside of the basic multilingual plane
> (i.e. value >= 0x10000).

Yes, as UTF-16 may be involved, which represents each non-BMP character as a pair of "surrogate" code units. It might be helpful to repeat all observations with other, non-emoji, non-BMP characters, in order to isolate the effects.

> One last thing I noticed: in ncurses, there seems to be some special
> stuff to implement wcwidth and wcswidth, including a comment in
> ncurses/widechar/widechars.c that says "MinGW has wide-character
> functions, but they do not work correctly." As far as I can tell,
> this is not enabled on cygwin; I'm not sure if it should be enabled or not.
>
> I hope I explained this well enough; it's a somewhat complicated issue
> and I don't know all of the relevant unicode vocabulary.

--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
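
For anyone wanting to repeat the observations with non-emoji, non-BMP characters: the following is only a rough sketch, not something taken from cygwin or ncurses. It shows how a code point above U+FFFF splits into two UTF-16 surrogates, and what East Asian Width class Python's bundled Unicode data assigns to it. The helper names (utf16_surrogates, utf16_units, describe) are made up for this illustration, and the width class reported depends on the Unicode version your Python ships with, so results may differ between systems.

```python
import unicodedata


def utf16_surrogates(code_point):
    """Return the (high, low) UTF-16 surrogate pair for a non-BMP code point."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points above the BMP need surrogates")
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # upper 10 bits of the offset
    low = 0xDC00 + (offset & 0x3FF)   # lower 10 bits of the offset
    return high, low


def utf16_units(text):
    """Count the 16-bit code units needed to store text as UTF-16."""
    return len(text.encode("utf-16-le")) // 2


def describe(char):
    """Summarise one character: code point, UTF-16 length, width class."""
    return "U+{:04X} utf16_units={} east_asian_width={}".format(
        ord(char), utf16_units(char), unicodedata.east_asian_width(char)
    )


if __name__ == "__main__":
    # The Unicode data version matters: width classes changed between versions.
    print("Unicode data version:", unicodedata.unidata_version)
    # A BMP letter, a Mathematical Alphanumeric Symbol, and an emoji.
    for ch in ("A", "\U0001D400", "\U0001F600"):
        print(describe(ch))
```

Running this on both machines (the cygwin side and the pi) and comparing the output would at least show whether the two Python builds disagree about the width class, separately from whatever the terminal does.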