Subject: Re: [ANNOUNCEMENT] Updated: dash-0.5.8-3
To: cygwin AT cygwin DOT com
From: Thomas Wolff
Date: Mon, 13 Feb 2017 23:03:11 +0100

On 31.01.2017 at 16:32, Corinna Vinschen wrote:
> On Jan 31 16:01, Houder wrote:
>> On Tue, 31 Jan 2017 14:16:16, Corinna Vinschen wrote:
>>
>> [snip]
>>
>>>> I'm not quite sure yet but apparently the problem is in the handling of
>>>> VERASE in the termios implementation.  In cooked mode it fills a char
>>>> buffer with what has been typed.  The code doesn't know if the bytes in
>>>> the buffer are UTF-8 chars or just random bytes.  So VERASE erases
>>>> exactly one byte, which means, in case of UTF-8 chars it only erases the
>>>> last byte of a multibyte character.
>>>>
>>>> ...
>>> Ok, here's what happens on Linux:  The termios code supports a flag
>>> IUTF8.  This flag determines if the termios code checks for UTF-8
>>> characters in the input when performing an ERASE.  It checks if the
>>> IUTF8 flag is set and if so, it checks in a loop if the just erased byte
>>> is a UTF-8 continuation character.  If so, it erases another byte.
>> Agreed. One byte or more, depending on the "character" ... (which is
>> not a problem in case of UTF-8 encoding -- continuation bit).
>>
>> Of course, the terminal driver must receive the characters encoded in UTF-8.
>>
>> ...
> ... It's the termios implementation
> inside Cygwin.  I created a patch introducing the IUTF8 flag as on Linux
> as well as a code snippet trying to remove entire UTF-8 characters from
> the input if the IUTF8 flag is set.  And it's set now by default since
> we default to UTF-8 anyway.
>
> Thomas, you may want to check for the IUTF8 flag in upcoming mintty
> versions and unset it if the character set configured in the mintty options
> dialog is != UTF-8.

So the flag is always set initially? Also on Linux? Does it (on Linux)
also have an effect for non-UTF-8 multibyte encodings?

And can't the Cygwin DLL set the flag to match the locale setting it
was invoked with?

I can (and will, if appropriate) handle the flag in mintty as needed, but
what if someone calls
    LC_ALL=.other_encoding dash
later within the terminal session?

I guess the more consistent solution would be to handle this in the
Cygwin DLL.

------
Thomas
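
For illustration, the erase loop described in the quoted text (erase one byte, then keep erasing while the byte just removed was a UTF-8 continuation byte) could be sketched in C roughly as follows. This is a minimal sketch under stated assumptions, not the actual Cygwin patch or the Linux tty code: the names erase_last_char and is_utf8_continuation are hypothetical, and the iutf8 parameter merely stands in for the state of the IUTF8 flag.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: UTF-8 continuation bytes have the bit pattern 10xxxxxx. */
static int is_utf8_continuation(unsigned char b)
{
    return (b & 0xC0) == 0x80;
}

/*
 * Hypothetical sketch of ERASE handling on a cooked-mode input buffer.
 * Returns the new length of the buffer after erasing one "character".
 * With iutf8 == 0 exactly one byte is dropped; with iutf8 != 0 the loop
 * keeps dropping bytes as long as the byte just erased was a
 * continuation byte, so a whole multibyte sequence disappears.
 */
size_t erase_last_char(const unsigned char *buf, size_t len, int iutf8)
{
    assert(buf != NULL || len == 0);

    while (len > 0) {
        unsigned char erased = buf[--len];   /* drop the last byte */
        if (!iutf8 || !is_utf8_continuation(erased))
            break;                           /* lead byte or ASCII: done */
    }
    return len;
}
```

With a buffer holding 'a' followed by the two-byte sequence 0xC3 0xA4, a call with iutf8 set shortens the buffer to one byte, while a call with iutf8 cleared leaves a dangling 0xC3 behind, which is the behaviour reported earlier in the thread. The check only makes sense for UTF-8; other multibyte encodings do not mark continuation bytes this way, so this sketch says nothing about how they would be handled.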