From: Eric Blake
To: cygwin AT cygwin DOT com
Subject: Re: [ANNOUNCEMENT] Updated: libreadline7-7.0.3-3
Date: Thu, 27 Jul 2017 16:37:45 -0500

On 07/27/2017 01:56 PM, Steven Penny wrote:
> On Thu, 27 Jul 2017 12:08:53, Eric Blake wrote:
>> I've got some time today to look at building readline, but for the life
>> of me, I can't figure out what I'm supposed to be debugging.  You have
>> so many emails saying "see this earlier URL" that I am lost in what you
>> are saying is wrong or how to reproduce it.
>
> Thanks for this. Between your 2 emails, youve put a lot on the table.
> Instead of getting overwhelmed, I will just start my side of the convo
> by replaying the problem. Then if you need more from me I am happy to
> help. So, here is an example problem using 'LATIN SMALL LETTER O WITH
> DIAERESIS' (U+00F6):
>
>    $ chcp.com 65001

I still don't know your environment (it's really hard to reproduce
issues if I don't know the steps to reproduce them).  This looks like a
bash prompt, but are you running bash inside mintty, or directly in a
cmd window?

When I first open a mintty window to get bash, I see:

    $ chcp.com
    Active code page: 437

and in that environment, typing one of the two alt- sequences for ö
displays nothing, but hitting Enter then displays:

    -bash: $'\302\224': command not found

which maps to \xc2\x94; I can confirm that with 'od -tx1'.  Trying the
other alt- sequence gives a different character (¦), as \xc2\xa6.  When
I then do

    $ chcp.com 65001
    Active code page: 65001

I don't see any change in behavior.

But if I first open a cmd window, with NO bash in the mix, I see:

    c:\cygwin\bin> chcp
    Active code page: 437

where both alt- sequences output ö, and where 'od -tx1' confirms both
sequences produce \xc3\xb6.  Then switching code pages:

    c:\cygwin\bin> chcp 65001
    Active code page: 65001

directly typing either sequence prints nothing, while 'od -tx1' still
shows that it received \xc3\xb6.

I have no idea how alt- sequences are mapped to code points (it is not
as trivial as a conversion of base to get either the Unicode code point
0xf6 or the UTF-8 encoding), but it appears that the input within cmd is
the same, while the choice of code page determines what the output will
be.  I also have no idea why the alt- sequences produce different inputs
under cmd than under mintty.

So knowing WHAT environment you are using is VITAL to my understanding
of the results you are seeing.

At any rate, I definitely know that U+00F6 is encoded as \xc3\xb6 in
UTF-8 (I confirmed that on Linux, with echo $'\xc3\xb6').  I _don't_
know what it is encoded as in Windows code page 437 or 65001.  But a
quick google later, I see that in code page 437
(https://en.wikipedia.org/wiki/Code_page_437), ö is at code point 0x94
(decimal 148, octal 0224); meanwhile, 0xf6 is equal to decimal 246.
Aha - maybe that explains the two alt- sequences under code page 437:
without a leading zero, you are typing the decimal position, which looks
up the character in the current code page; WITH a leading zero, you are
directly requesting the decimal value of a Unicode code point.
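
A minimal Python sketch of the byte values discussed above, for anyone
who wants to cross-check them.  It assumes that Python 3's built-in
"cp437" codec matches the Windows code page 437 table; the variable
names are made up for illustration.

```python
# Sketch only: relates the byte values quoted above to U+00F6.
# Assumption: Python's "cp437" codec matches Windows code page 437.

O_DIAERESIS = "\u00f6"  # LATIN SMALL LETTER O WITH DIAERESIS

utf8_bytes = O_DIAERESIS.encode("utf-8")   # the bytes 'od -tx1' shows under cmd
cp437_bytes = O_DIAERESIS.encode("cp437")  # position of the character in CP437

# Decimal 148 looked up in code page 437, and decimal 246 taken as a
# Unicode code point, both name the same character.
from_cp437_position = bytes([148]).decode("cp437")
from_unicode_decimal = chr(246)

# The two byte pairs seen under mintty decode to other code points.
mintty_first = b"\xc2\x94".decode("utf-8")   # U+0094, a C1 control character
mintty_second = b"\xc2\xa6".decode("utf-8")  # U+00A6, BROKEN BAR

if __name__ == "__main__":
    print("UTF-8:", utf8_bytes.hex(), "CP437:", cp437_bytes.hex())
    print("mintty:", f"U+{ord(mintty_first):04X}", f"U+{ord(mintty_second):04X}")
```

U+0094 has the same numeric value (148) as the code page 437 position
of ö, which suggests, but does not prove, that mintty passed the typed
number through as a code point rather than looking it up in the code
page.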

And trying some other sequences, I note that õ ('LATIN SMALL LETTER O
WITH TILDE' (U+00F5)) is not part of code page 437, so there is nothing
I can type without a leading 0 to print one; conversely, trying the
leading-zero alt- sequence, which requests the same Unicode character,
displays merely 'o' (apparently U+006F), which, when you lack
o-with-tilde, is a reasonable fallback compared to printing nothing at
all.  Either way, the character requested by the alt- sequence in the
cmd window is then transformed by Cygwin into the appropriate UTF-8
input for the tty stdin of the Cygwin child process.

Hmm; repeating those sequences under 'od -tx1', when I try that
leading-zero sequence, I see something interesting: the moment I press 5
(while still holding alt), the display prints [G; then releasing alt
prints o; the transcription is then

    0000000 1b 1b 5b 47 c3 b5 0a

which is ESC ESC [ G (hmm - that's the ANSI terminal escape sequence for
moving the cursor to the first column), followed by the actual Unicode
õ, before my ending newline.  No idea why that is leaking through to
Cygwin to pick up as input.  Is Windows trying to beep at me to tell me
my Unicode request doesn't exist in the current code page?  Except that
beep is Ctrl-G (U+0007).

But when I look up code page 65001, Wikipedia redirects me to UTF-8.  So
in that code page, presumably all alt- sequences represent themselves,
whether or not there is a leading 0?  No, experimentation shows
otherwise: one alt- sequence shows nothing (and not the smiley face from
code page 437), while another shows ^B (where ctrl-b really is code
point 2).  I have no idea WHAT sequence would thus give you ö.

> Now you might say, why not just use codepage 437? Which is exactly what
> Corinna did say:
>
> http://cygwin.com/ml/cygwin/2017-03/msg00193.html

Well, obviously, the code page matters to cmd; and I have no idea what
alt- sequences do (or are supposed to do) under mintty.
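
Going back to the 'od -tx1' transcription above, a short Python sketch
can split those seven bytes into the escape prefix and the character
that follows, and check which of the two letters code page 437 can
represent.  The helper in_code_page() is a hypothetical name, and the
check relies on Python's "cp437" codec rather than on Windows itself.

```python
# Sketch only: decode the bytes from the 'od -tx1' transcription.
raw = bytes.fromhex("1b 1b 5b 47 c3 b5 0a")
text = raw.decode("utf-8")

# First four characters: ESC, then ESC [ G; the rest is the typed input.
prefix = text[:4]
payload = text[4:]


def in_code_page(ch: str, codec: str = "cp437") -> bool:
    """Return True if ch can be encoded in the given code page."""
    try:
        ch.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    print("prefix bytes:", prefix.encode("utf-8").hex())
    print("payload code points:", [f"U+{ord(c):04X}" for c in payload])
    print("o-diaeresis in CP437:", in_code_page("\u00f6"))
    print("o-tilde in CP437:", in_code_page("\u00f5"))
```
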
So there may STILL be some lingering craziness on what Cygwin itself
should do when it recognizes an alt- sequence coming in (if Cygwin
translates from the current code page to Unicode, where the current code
page definitely affects which character is desired); and that's _in
addition_ to what appears to be the craziness in bash when
reconstructing the UTF-8 sequence for omega Ω as mentioned in my other
mail.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org