Mail Archives: cygwin/2009/06/03/12:02:55

X-Recipient: archive-cygwin AT delorie DOT com
X-Spam-Check-By: sourceware.org
Date: Wed, 3 Jun 2009 12:02:25 -0400
From: Christopher Faylor <cgf-use-the-mailinglist-please AT cygwin DOT com>
To: cygwin AT cygwin DOT com
Subject: Re: 1.7.0-48: [BUG] Passing characters above 128 from bash command line
Message-ID: <20090603160225.GA27039@ednor.casa.cgf.cx>
Reply-To: cygwin AT cygwin DOT com
Mail-Followup-To: cygwin AT cygwin DOT com
References: <4A200287 DOT 8030403 AT sidefx DOT com> <3f0ad08d0905290852xe41338alfda89c622f92f677 AT mail DOT gmail DOT com> <4A200BC0 DOT 9010704 AT sidefx DOT com> <e2480c70905291142o2bcc65ccw2287d175dbd09dd5 AT mail DOT gmail DOT com> <4A204149 DOT 2050009 AT sidefx DOT com> <e2480c70905291337g6c8bcca7xd0baba79c84629db AT mail DOT gmail DOT com> <4A2051E5 DOT 6060600 AT sidefx DOT com> <20090602205440 DOT GF23519 AT calimero DOT vinschen DOT de> <4A26782C DOT 9040207 AT sidefx DOT com> <20090603142755 DOT GM23519 AT calimero DOT vinschen DOT de>
MIME-Version: 1.0
In-Reply-To: <20090603142755.GM23519@calimero.vinschen.de>
User-Agent: Mutt/1.5.19 (2009-01-05)
Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm
List-Id: <cygwin.cygwin.com>
List-Unsubscribe: <mailto:cygwin-unsubscribe-archive-cygwin=delorie DOT com AT cygwin DOT com>
List-Subscribe: <mailto:cygwin-subscribe AT cygwin DOT com>
List-Archive: <http://sourceware.org/ml/cygwin/>
List-Post: <mailto:cygwin AT cygwin DOT com>
List-Help: <mailto:cygwin-help AT cygwin DOT com>, <http://sourceware.org/ml/#faqs>
Sender: cygwin-owner AT cygwin DOT com
Delivered-To: mailing list cygwin AT cygwin DOT com

On Wed, Jun 03, 2009 at 04:27:55PM +0200, Corinna Vinschen wrote:
>On Jun  3 09:18, Edward Lam wrote:
>> Corinna Vinschen wrote:
>>> The question is, what do you expect?  [...]
>> [...]
>> Wikipedia has several suggestions on how to handle invalid UTF-8 byte  
>> sequences (http://en.wikipedia.org/wiki/UTF-8). Personally, I favor the  
>> rule that uses the replacement character.
>
>Chris implemented using the invalid code point solution.  The discussion
>in http://www.mail-archive.com/linux-utf8@nl.linux.org/msg00080.html
>supports this solution.  What's missing so far is the way back, from
>an invalid single second half of a surrogate pair in the 0xDCxx range
>back to the correct byte value.  I'm just looking into that.

The way back was not, AFAIK, needed for Cygwin programs.  I don't think
there is a valid way back for Windows programs.

cgf
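
For readers following the thread, here is a minimal sketch of the two
options discussed above: replacing invalid UTF-8 bytes with U+FFFD, and
mapping each invalid byte 0xNN to the lone low surrogate U+DCNN so that
a "way back" to the original byte exists. It uses Python's built-in
"replace" and "surrogateescape" error handlers, which follow the same
idea. It is not Cygwin's code, and the function names and sample bytes
are assumptions made for illustration.

```python
def bytes_to_text(raw: bytes) -> str:
    """Decode UTF-8, turning each undecodable byte 0xNN into U+DCNN."""
    return raw.decode("utf-8", errors="surrogateescape")


def text_to_bytes(text: str) -> bytes:
    """The way back: turn each lone U+DCNN into the original byte 0xNN."""
    return text.encode("utf-8", errors="surrogateescape")


# Hypothetical input: "caf" followed by 0xE9 (Latin-1 e-acute), which is
# not a valid UTF-8 sequence on its own.
raw = b"caf\xe9"

# Option 1: replacement character. Readable, but the byte value is lost.
lossy = raw.decode("utf-8", errors="replace")

# Option 2: invalid code point in the 0xDCxx range. The byte survives.
escaped = bytes_to_text(raw)
restored = text_to_bytes(escaped)

print(repr(lossy))     # 'caf\ufffd'
print(repr(escaped))   # 'caf\udce9'
print(restored == raw) # True
```

The round trip only works while the string stays inside code that knows
about the convention. A strict UTF-8 encoder rejects lone surrogates, so
the escaped string cannot be handed to other consumers as ordinary text.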

