Mail Archives: cygwin/2018/09/21/13:43:29

X-Recipient: archive-cygwin AT delorie DOT com
Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm
List-Id: <cygwin.cygwin.com>
List-Subscribe: <mailto:cygwin-subscribe AT cygwin DOT com>
List-Archive: <http://sourceware.org/ml/cygwin/>
List-Post: <mailto:cygwin AT cygwin DOT com>
List-Help: <mailto:cygwin-help AT cygwin DOT com>, <http://sourceware.org/ml/#faqs>
Sender: cygwin-owner AT cygwin DOT com
Mail-Followup-To: cygwin AT cygwin DOT com
Delivered-To: mailing list cygwin AT cygwin DOT com
Authentication-Results: sourceware.org; auth=none
X-Spam-SWARE-Status: No, score=0.9 required=5.0 tests=AWL,BAYES_40,KAM_LAZY_DOMAIN_SECURITY,RCVD_IN_DNSWL_NONE autolearn=no version=3.3.2 spammy=conclusion, plane, visually, UD:imgur.com
X-HELO: mout.kundenserver.de
Subject: Re: Issues with width of emoji
To: cygwin AT cygwin DOT com
References: <CAEpCGDR1DDu1rPUge1H_LoGDP0gG4fd6S+BDKZ=f9oni6=eU8Q AT mail DOT gmail DOT com>
From: Thomas Wolff <towo AT towo DOT net>
Message-ID: <70b0e3e9-d961-983c-0c95-5a0ace8a99e6@towo.net>
Date: Fri, 21 Sep 2018 19:43:13 +0200
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.9.1
MIME-Version: 1.0
In-Reply-To: <CAEpCGDR1DDu1rPUge1H_LoGDP0gG4fd6S+BDKZ=f9oni6=eU8Q@mail.gmail.com>
X-IsSubscribed: yes

Am 21.09.2018 um 03:42 schrieb Pokechu22:
> Hello!  For a while I've had issues with emoji and cygwin, but due to
> some recent configuration changes on my end it's gotten to the point
> where it's actively causing problems.
Some of your problem descriptions remain a bit obscure, e.g. which
recent changes caused which problems.

> My specific case involves running weechat on my raspberry pi, which I
> connect to with `mosh pi -- screen -D -RR weechat
> /usr/local/bin/weechat-curses`.  When someone used an emoji on IRC,
> the entire screen would get messed up in some cases, as things got
> misaligned (an example of this: https://i.imgur.com/V7D6jPc.png).
> Previously I had a script that converted emoji into their escapes,
What are "their escapes"? Emojis are encoded directly in Unicode, so
they do not need any escapes.

> but that recently started misbehaving; even with that script there were
> other unicode characters such as the mathematical alphanumeric symbols
> characters (<https://en.wikipedia.org/wiki/Mathematical_Alphanumeric_Symbols>)
Unicode does not define any emojis in the range Mathematical 
Alphanumeric Symbols (U+1D400-U+1D7FF).

> that caused the issue too; I'm still going to refer to these as emoji
> because I most commonly have this problem with emoji and I don't have
> a good name otherwise.
>
> I initially assumed that this was a problem with mosh on the pi, what
> with the pi being an ARM device.  However, after later investigation,
> it turns out that it's a cygwin problem.  Some different cases where
> things behave weirdly:
>
> * Typing an emoji and then pressing backspace twice ends up deleting
> the emoji and the character before visually, but the character before
> isn't actually deleted (e.g. echo hi<emoji> then backspace twice still
> prints hi)
See your own conclusion below.
> * Running mosh, even as a loopback (`mosh --local ::1`), shows 2
> characters when the emoji is typed
> * Emoji behave incorrectly when pasted into nano
> * curses apps (which include mosh and nano) write a 2-wide space for
> emoji, as can be seen in this script
> <https://gist.github.com/Pokechu22/45d19aa5e41ee6db00723f808ac4339e>.
> This is only 1 character wide on my pi.
This may be related to different Unicode versions. Width for many emojis 
changed from 1 to 2 in Unicode 9 (I think).
> * There are no problems when using SSH, at least to my pi, interestingly.
So please describe how you connect when the same test cases behave 
differently.
> * Python refuses to create a ctypes.c_wchar containing an emoji, but
> considers the len of a string with a single emoji to be 1.  On my pi
> it creates a c_wchar properly.
>
> I think that most of the desyncs and other weird things I've been
> getting are a result of different systems disagreeing about how wide
> the character should be;
Yes, and of different applications. Do you actually run the cygwin 
terminal or the cygwin console for your test cases?
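
To see what one particular environment assumes about widths, a minimal
Python sketch like the following may help. It is an approximation only:
it uses the East Asian Width property from the Unicode data bundled with
the Python interpreter, which need not match what a terminal or the C
library's wcwidth() reports. The helper name approx_columns and the
sample characters are chosen here for illustration.

```python
import unicodedata

def approx_columns(ch):
    """Rough column estimate: 2 for East Asian Wide/Fullwidth, else 1.

    This ignores combining and control characters, so it is only a
    first approximation of what wcwidth() would return.
    """
    return 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1

# Illustrative samples: ASCII, a CJK ideograph, an emoji, and a
# non-emoji character outside the BMP.
samples = {
    "LATIN SMALL LETTER A": "a",
    "CJK IDEOGRAPH U+4E2D": "\u4e2d",
    "GRINNING FACE (emoji)": "\U0001F600",
    "MATHEMATICAL BOLD CAPITAL A": "\U0001D400",
}

# The result depends on the Unicode version of the interpreter's data.
print("Unicode data version:", unicodedata.unidata_version)
for name, ch in samples.items():
    print(f"U+{ord(ch):04X} {name}: {approx_columns(ch)} column(s)")
```

Running this on both machines and comparing the reported Unicode data
version and widths would show whether the two ends already disagree at
this level.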

> that makes the most sense at least.
> Alternatively, it might be an issue with the character being
> represented as multiple characters; as far as I can tell there are
> only problems with characters outside of the basic multilingual plane
> (i.e. value >= 0x10000).
Yes, as UTF-16 may be involved, which represents each non-BMP character
as a pair of two "surrogate" code units.
It might be helpful to repeat all observations with other, non-emoji,
non-BMP characters, in order to isolate the effects.
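
As a sketch of what the UTF-16 representation looks like, the following
Python snippet lists the 16-bit code units for a few test characters,
including non-emoji, non-BMP ones. The helper name utf16_units and the
choice of sample characters are illustrative. The last line prints the
size of ctypes.c_wchar; assuming it is 2 bytes on the cygwin side and 4
on the pi, that would be consistent with a single c_wchar being unable
to hold a non-BMP character on cygwin, but this is a guess to be checked,
not a confirmed diagnosis.

```python
import ctypes

def utf16_units(ch):
    """Return the UTF-16 code units of a string as a list of integers."""
    data = ch.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little")
            for i in range(0, len(data), 2)]

# One BMP character, two non-emoji non-BMP characters, one emoji.
test_chars = ("\u00e9", "\U0001D400", "\U00010330", "\U0001F600")

for ch in test_chars:
    units = " ".join(f"{u:04X}" for u in utf16_units(ch))
    print(f"U+{ord(ch):04X}: len()={len(ch)}, UTF-16 units: {units}")

# Platform-dependent: size in bytes of the C wchar_t as seen by ctypes.
print("sizeof(c_wchar) =", ctypes.sizeof(ctypes.c_wchar))
```

If the non-emoji characters U+1D400 and U+10330 trigger the same
misbehaviour as the emoji, the cause is more likely the non-BMP encoding
than the emoji width.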

> One last thing I noticed: in ncurses, there seems to be some special
> stuff to implement wcwidth and wcswidth, including a comment in
> ncurses/widechar/widechars.c that says "MinGW has wide-character
> functions, but they do not work correctly."  As far as I can tell,
> this is not enabled on cygwin; I'm not sure if it should be enabled or not.
>
> I hope I explained this well enough; it's a somewhat complicated issue
> and I don't know all of the relevant unicode vocabulary.

--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple

  Copyright © 2019   by DJ Delorie     Updated Jul 2019