Mail Archives: cygwin/2009/06/26/10:51:05

In-Reply-To: <4A442339.8060406@aim.com>
References: <4A442339 DOT 8060406 AT aim DOT com>
Date: Fri, 26 Jun 2009 10:50:45 -0400
Message-ID: <f60fe000906260750p32eaa403wb0cb06af070b3be6@mail.gmail.com>
Subject: Re: Problem with displaying ASCII table in mintty
From: "Mark J. Reed" <markjreed AT gmail DOT com>
To: cygwin AT cygwin DOT com

On Thu, Jun 25, 2009 at 9:24 PM, Mark Harig wrote:
> Is it possible to display the upper 128 entries in the ASCII
> table in mintty using the 'cygutils' application 'ascii'?

The ASCII table doesn't have an upper 128 entries.  Only codes 0
through 127 decimal are defined by ASCII.  Once you hit 128 you're not
in ASCII anymore, and what you *are* in depends entirely on what code
page you're using.

128 through 159 are control characters in Unicode and Latin-1, but
mostly printable characters in Windows 1252 (a few positions there are
unassigned).  160 through 255 are the same in Windows 1252, Latin-1,
and Unicode, but defined differently in the other ISO-8859 and
ISO-2022 character sets and Windows code pages.
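
As a rough illustration of the point above, here is a small Python
sketch that decodes the same byte values under Latin-1 and Windows
1252.  The helper name decode_byte and the sample values are my own
choices for the example, not anything from cygutils or mintty.

```python
# Decode single bytes under two code pages to see where they differ.
# decode_byte is a hypothetical helper written for this example.

def decode_byte(value, encoding):
    """Return the character that one byte value maps to in `encoding`."""
    return bytes([value]).decode(encoding)

# 0x80 and 0x93 fall in the 128-159 range; 0xA9 and 0xE9 are above 159.
samples = [0x80, 0x93, 0xA9, 0xE9]

for value in samples:
    latin1 = decode_byte(value, "latin-1")
    cp1252 = decode_byte(value, "cp1252")
    verdict = "same" if latin1 == cp1252 else "different"
    print(value, ascii(latin1), ascii(cp1252), verdict)
```

The two values below 160 should come out different (a control
character in Latin-1, a printable character in Windows 1252), and the
two above 159 should come out the same.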

If you're using UTF-8 (one particular way of encoding Unicode
characters, which are defined as numbers, into concrete bytes),
then only characters 0 through 127 can be expressed in one byte.
Characters from 128 through 2047 take two bytes; the rest of the BMP
(2048 through 65535) takes three bytes per character, and the rest of
Unicode takes four bytes per character.
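
If you want to check those byte counts yourself, something like the
following Python sketch should do it.  The function name utf8_length
is just a label I picked for the example; it probes the code points
on either side of each boundary.

```python
# Count how many bytes UTF-8 needs for a given Unicode code point.
# utf8_length is a hypothetical helper written for this example.

def utf8_length(code_point):
    """Return the number of bytes in the UTF-8 encoding of code_point."""
    return len(chr(code_point).encode("utf-8"))

# Code points on each side of the one/two/three/four-byte boundaries.
boundaries = [0, 127, 128, 2047, 2048, 65535, 65536, 0x10FFFF]

for cp in boundaries:
    print(cp, utf8_length(cp))
```

Note that the surrogate range (55296 through 57343) is not encodable
in UTF-8 at all, so the sketch deliberately avoids those values.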

So if you just send the byte with decimal value 128, not preceded by
the start of a UTF-8 sequence, to a UTF-8 terminal, the terminal will
reject it as invalid, or display gobbledygook, depending on its error
handling design.
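
To make that concrete, here is a Python sketch of what a strict
decoder and a lenient one do with a lone byte 128.  This only shows
Python's own UTF-8 codec; how mintty or any other terminal actually
handles the byte is a separate question, and the variable names are
mine.

```python
# A lone byte 128 (0x80) is a UTF-8 continuation byte with no lead byte.
lone = bytes([128])

# Strict decoding rejects it outright.
try:
    lone.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# Lenient decoding substitutes U+FFFD, the replacement character.
replaced = lone.decode("utf-8", errors="replace")

# The valid UTF-8 encoding of character 128 is two bytes long.
proper = chr(128).encode("utf-8")

print(strict_ok, ascii(replaced), proper)
```

So character 128 has to be sent as the two-byte sequence 0xC2 0x80
for a UTF-8 decoder to accept it.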

-- 
Mark J. Reed <markjreed AT gmail DOT com>
