Mail Archives: cygwin/2009/06/26/12:52:20

Message-ID: <4A44FC98.2050500@aim.com>
Date: Fri, 26 Jun 2009 12:51:36 -0400
From: Mark Harig <idirectscm AT aim DOT com>
To: The Cygwin Mailing List <cygwin AT cygwin DOT com>
Subject: Re: Problem with displaying ASCII table in mintty

Fri, 26 Jun 2009 10:50:45 -0400 Mark J. Reed wrote:

> > Is it possible to display the upper 128 entries in the ASCII
> > table in mintty using the 'cygutils' application 'ascii'?
>
> The ASCII table doesn't have an upper 128 entries.  Only codes 0
> through 127 decimal are defined by ASCII.  Once you hit 128 you're not
> in ASCII anymore, and what you *are* in depends entirely on what code
> page you're using.
>
> 128 through 159 are control characters in Unicode and Latin-1, but
> printable characters in Windows 1252.  160 through 255 are the same in
> Windows 1252, Latin-1, and Unicode, but defined differently in the
> other ISO-8859 and ISO-2022 character sets and Windows code pages.
>
> If you're using UTF-8 (a particular way of representing Unicode
> characters, which are defined as numbers, as concrete bits and bytes),
> then only characters 0 through 127 can be expressed in one byte.
> Characters from 128 to 2047 take two bytes; the rest of the BMP (2048
> through 65535) takes three bytes per character, and the rest of Unicode
> four bytes per character.
>
> So if you just send the byte with decimal value 128, not preceded by
> the start of a UTF-8 sequence, to a UTF-8 terminal, the terminal will
> reject it as invalid, or display gobbledygook, depending on its error
> handling design.
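
The byte counts and the lone-byte behaviour described in the quoted text
can be checked with a minimal sketch. It is written in Python for brevity
and is not taken from 'cygutils'; the names utf8_len, samples, lengths,
strict_ok and replaced are made up for this illustration.

```python
# Sketch: how many bytes UTF-8 needs per code point, and what happens
# when a lone byte 0x80 (decimal 128) is decoded as UTF-8.

def utf8_len(code_point: int) -> int:
    """Return the number of bytes UTF-8 uses for one code point."""
    return len(chr(code_point).encode("utf-8"))

# Boundary values of each range mentioned in the quoted explanation.
samples = [0, 127, 128, 2047, 2048, 65535, 65536, 0x10FFFF]
lengths = {cp: utf8_len(cp) for cp in samples}

for cp, n in lengths.items():
    print(f"U+{cp:04X} (decimal {cp}): {n} byte(s)")

# A byte with value 128 on its own is a continuation byte with no lead
# byte, so a strict UTF-8 decoder rejects it.
try:
    b"\x80".decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# A lenient decoder substitutes U+FFFD (the replacement character) instead.
replaced = b"\x80".decode("utf-8", errors="replace")

# The code point 128 itself is representable, but only as two bytes.
print(chr(128).encode("utf-8"))
```

Which of the two error behaviours a given terminal shows is up to the
terminal; the sketch only demonstrates that both are possible outcomes.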

Thank you for the explanation.  I see from the manual page for 'ascii'
provided at http://www.kernel.org/pub/linux/docs/manpages/ that the
ASCII table is as you described (that is, 128 entries only).  Do you have
any recommendations about what the utility program /usr/bin/ascii
(in the package 'cygutils') should do?  Should it not provide a display
of the values above 127 because they are not part of the ASCII table?
Does it make sense to provide options that handle the values above 127
under the various conditions described above?
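
To make the last question concrete, here is a rough sketch of what such
an option might compute: the byte values 128 through 255 interpreted
under a caller-chosen code page. It is written in Python rather than C,
and the function show_upper_half and its parameters are hypothetical;
nothing here reflects how /usr/bin/ascii is actually implemented.

```python
# Sketch: interpret byte values 128..255 under a chosen code page.
# Undefined bytes become U+FFFD; non-printable results are shown as ".".

def show_upper_half(encoding: str, start: int = 128, stop: int = 256) -> list:
    """Return one display cell per byte value in [start, stop)."""
    cells = []
    for value in range(start, stop):
        text = bytes([value]).decode(encoding, errors="replace")
        cells.append(text if text.isprintable() else ".")
    return cells

if __name__ == "__main__":
    for name in ("latin-1", "cp1252", "iso8859-15"):
        cells = show_upper_half(name)
        print(name)
        # Print the table 16 cells per row, labelled with the first value.
        for row in range(0, len(cells), 16):
            print(f"  {128 + row:3d}  " + " ".join(cells[row:row + 16]))
```

Under Latin-1 the cells for 128 through 159 come out as "." because they
are control characters, while under Windows 1252 most of them are
printable.  The displayed result would still depend on the terminal's own
character set matching what the program writes.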


--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
