delorie.com/archives/browse.cgi   search  
Mail Archives: cygwin/2009/05/29/14:42:54

X-Recipient: archive-cygwin AT delorie DOT com
X-SWARE-Spam-Status: No, hits=-0.6 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_41,SARE_MSGID_LONG40,SPF_PASS
X-Spam-Check-By: sourceware.org
MIME-Version: 1.0
In-Reply-To: <4A200BC0.9010704@sidefx.com>
References: <200905281541 DOT 33404 DOT michael DOT renner AT gmx DOT de> <4A1EAAED DOT 1060702 AT cygwin DOT com> <4A1EAD61 DOT 5010308 AT sidefx DOT com> <4A1EAD91 DOT 1060701 AT sidefx DOT com> <e2480c70905281131u37651a2eoba946637bd414516 AT mail DOT gmail DOT com> <4A1EF2CE DOT 2060509 AT sidefx DOT com> <3f0ad08d0905290813m39999f81q918e94e3c960eb3f AT mail DOT gmail DOT com> <4A200287 DOT 8030403 AT sidefx DOT com> <3f0ad08d0905290852xe41338alfda89c622f92f677 AT mail DOT gmail DOT com> <4A200BC0 DOT 9010704 AT sidefx DOT com>
Date: Fri, 29 May 2009 22:42:32 +0400
Message-ID: <e2480c70905291142o2bcc65ccw2287d175dbd09dd5@mail.gmail.com>
Subject: Re: 1.7.0-48: [BUG] Passing characters above 128 from bash command line
From: Alexey Borzenkov <snaury AT gmail DOT com>
To: cygwin AT cygwin DOT com
X-IsSubscribed: yes
Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm
List-Id: <cygwin.cygwin.com>
List-Unsubscribe: <mailto:cygwin-unsubscribe-archive-cygwin=delorie DOT com AT cygwin DOT com>
List-Subscribe: <mailto:cygwin-subscribe AT cygwin DOT com>
List-Archive: <http://sourceware.org/ml/cygwin/>
List-Post: <mailto:cygwin AT cygwin DOT com>
List-Help: <mailto:cygwin-help AT cygwin DOT com>, <http://sourceware.org/ml/#faqs>
Sender: cygwin-owner AT cygwin DOT com
Mail-Followup-To: cygwin AT cygwin DOT com
Delivered-To: mailing list cygwin AT cygwin DOT com

On Fri, May 29, 2009 at 8:22 PM, Edward Lam <edward AT sidefx DOT com> wrote:
> I think there is still a bug here? I set LANG=C, then shouldn't be just NOT
> doing any encoding, thus work? If I do this on Linux, it works. If I use a
> cygwin compiled app, it also works.

On Linux, internally, system uses multibyte strings (it is encoding
agnostic even), but on Windows, system uses unicode strings, so cygwin
has to decode your byte sequences somehow to pass them to non-cygwin
processes as unicode (the fact that cygwin now understands unicode is
a huge plus to me). In earlier discussions it was decided that cygwin
C locale should use utf-8 encoding, because file system internally
uses unicode it's the safest default to represent all possible
filenames, etc. In previous cygwin versions, your byte sequences were
just silently converted using your system's codepage (by the system
itself, even), so if you want the old behavior you should set
LANG=en_US.CP1252.

The only bug here is that the arguments are truncated instead of using
some kind of a replacement character, is it related to some posix
complience, like with wprintf?

--
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
Problem reports:       http://cygwin.com/problems.html
Documentation:         http://cygwin.com/docs.html
FAQ:                   http://cygwin.com/faq/

- Raw text -


  webmaster     delorie software   privacy  
  Copyright © 2019   by DJ Delorie     Updated Jul 2019