From: Lee <ler762@gmail.com>
Date: Wed, 27 Jun 2018 02:25:41 -0400
Message-ID: <CAD8GWstSXHT0xFXbrzQNcOCdME7p2zRLSRffJe4BjhFuP48-Bw@mail.gmail.com>
Subject: Re: UTF-8 character encoding
To: cygwin@cygwin.com
Content-Type: text/plain; charset="UTF-8"

On 6/26/18, Thomas Wolff wrote:

> This encoding scheme is wrong; where did you get it from? Maybe it's the
> obsolete UTF-8...

http://www.cl.cam.ac.uk/~mgk25/ucs/utf-8-history.txt

I thought I saw something about UTF-8 being able to handle a 31-bit
value. Is that also obsolete/wrong?

How about this for the current encoding scheme:
http://www.unicode.org/versions/Unicode11.0.0/ch03.pdf

Table 3-6.  UTF-8 Bit Distribution
Bits  Scalar Value                 First Byte  Second Byte  Third Byte  Fourth Byte
  7   00000000 0xxxxxxx            0xxxxxxx
 11   00000yyy yyxxxxxx            110yyyyy    10xxxxxx
 16   zzzzyyyy yyxxxxxx            1110zzzz    10yyyyyy     10xxxxxx
 21   000uuuuu zzzzyyyy yyxxxxxx   11110uuu    10uuzzzz     10yyyyyy    10xxxxxx
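
To check my reading of that table, here is a minimal Python sketch that
packs a scalar value into bytes following the bit layout above. The
function name utf8_encode is just something I made up for illustration,
and the range checks (nothing above U+10FFFF, no surrogates) are my
understanding of what the current standard allows, so treat it as a
sketch rather than a reference implementation.

```python
def utf8_encode(cp: int) -> bytes:
    """Encode one Unicode scalar value using the Table 3-6 bit layout.

    utf8_encode is a hypothetical helper name, not a library function.
    """
    # Assumption: only scalar values are valid input, i.e. 0..0x10FFFF
    # excluding the surrogate range D800..DFFF.
    if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError(f"not a Unicode scalar value: {cp:#x}")

    if cp < 0x80:
        # 7 bits: 0xxxxxxx
        return bytes([cp])
    if cp < 0x800:
        # 11 bits: 110yyyyy 10xxxxxx
        return bytes([0xC0 | (cp >> 6),
                      0x80 | (cp & 0x3F)])
    if cp < 0x10000:
        # 16 bits: 1110zzzz 10yyyyyy 10xxxxxx
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    # 21 bits: 11110uuu 10uuzzzz 10yyyyyy 10xxxxxx
    return bytes([0xF0 | (cp >> 18),
                  0x80 | ((cp >> 12) & 0x3F),
                  0x80 | ((cp >> 6) & 0x3F),
                  0x80 | (cp & 0x3F)])


if __name__ == "__main__":
    # Compare against Python's built-in encoder for one value per row.
    for cp in (0x41, 0xE9, 0x20AC, 0x1F600):
        print(f"U+{cp:04X}", utf8_encode(cp).hex(" "),
              utf8_encode(cp) == chr(cp).encode("utf-8"))
```

With this layout a four-byte sequence carries at most 21 bits, so a
31-bit value would not fit without the longer forms described in the
older documents.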

Lee


