Mail Archives: cygwin/2009/06/23/11:05:07

Date: Tue, 23 Jun 2009 17:04:33 +0200 (CEST)
Message-Id: <200906231504.n5NF4Xiv027571@mail.bln1.bf.nsn-intra.net>
From: Thomas Wolff <towo AT towo DOT net>
To: cygwin AT cygwin DOT com
Subject: Re: default codepage
References: <200906221448 DOT n5MEmF1r018726 AT mail DOT bln1 DOT bf DOT nsn-intra DOT net> <200906231345 DOT n5NDj9i1026763 AT mail DOT bln1 DOT bf DOT nsn-intra DOT net> <20090623140643 DOT GB3024 AT calimero DOT vinschen DOT de>
Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm
List-Id: <cygwin.cygwin.com>
List-Subscribe: <mailto:cygwin-subscribe AT cygwin DOT com>
List-Archive: <http://sourceware.org/ml/cygwin/>
List-Post: <mailto:cygwin AT cygwin DOT com>
List-Help: <mailto:cygwin-help AT cygwin DOT com>, <http://sourceware.org/ml/#faqs>
Sender: cygwin-owner AT cygwin DOT com
Mail-Followup-To: cygwin AT cygwin DOT com
Delivered-To: mailing list cygwin AT cygwin DOT com

> > > > On Jun 22 16:48, Thomas Wolff wrote:
> > > > > Since the latest locale-related changes, the default codepage after 
> > > > > starting cygwin _without_ explicit setting (of a locale variable) 
> > > > > seems to have changed from CP1252 ("Windows ANSI") to ISO 8859-1 ("Latin 1").
> > > > > Was this change on purpose?
> > > > 
> ...
> I tested this myself and now I understand what you mean.  The console
> seems to use ISO-8859-1, but actually it doesn't.  What happens is this:
> The console I/O functions are using UTF-16 under the hood, so each
> incoming character is converted to Unicode.  The ASCII->Unicode
> conversion treats all incoming bytes literally.  Since the Unicode
> values from 0x80 to 0xff are derived from the ISO-8859-1 table, you
> actually see ISO-8859-1 by default on the console.
Understood; so the effective codepage of the terminal is ISO-8859-1, 
by whatever mechanism this is achieved. Functions such as wcwidth may 
behave differently in this configuration (I haven't tested this), 
which could raise additional problems.
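
To illustrate the mechanism described in the quoted paragraph, here is a 
minimal Python sketch (not Cygwin's actual console code). The helper name 
widen_bytes is made up for this example. It assumes only that each 
incoming byte is taken literally as a Unicode code point, and it uses 
Python's own "latin-1" and "cp1252" codecs for comparison.

```python
# Sketch only: models "treat every incoming byte literally as a Unicode
# code point". widen_bytes is a hypothetical helper, not a Cygwin function.

def widen_bytes(data: bytes) -> str:
    """Zero-extend each byte to the Unicode code point with the same value."""
    return "".join(chr(b) for b in data)

high_bytes = bytes(range(0x80, 0x100))

# Literal widening is exactly what ISO-8859-1 (Latin-1) decoding does.
assert widen_bytes(high_bytes) == high_bytes.decode("latin-1")

# Compare with CP1252 byte by byte. errors="replace" is needed because
# Python's cp1252 codec leaves a few byte values undefined.
differing = [
    b for b in range(0x80, 0x100)
    if widen_bytes(bytes([b])) != bytes([b]).decode("cp1252", errors="replace")
]

# The two mappings disagree only in the range 0x80..0x9f.
print([hex(b) for b in differing])

# Example: 0x80 is the Euro sign in CP1252 but a C1 control in ISO-8859-1.
print(repr(b"\x80".decode("cp1252")), repr(widen_bytes(b"\x80")))
```

So the visible difference between the two defaults is limited to the bytes 
0x80 to 0x9f; from 0xa0 upwards both codepages yield the same characters.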

> So here's the question:  Why is that a problem?  It's just the default
> output.  I *can't* use CP1252 as default, because it's only a valid
> default on western language versions of Windows.  Rather I would have to
> use the default ANSI codepage, whatever that is on the machine.
OK, if that's how it was in 1.5, it would be fine.
> ISO-8859-1 OTOH is the least intrusive default since it allows a
> representation on all machines, independent of their default ANSI
> codepage.
The new approach is not a problem for me. I was just wondering about 
compatibility issues and pondering that keeping the 1.5 default might 
reduce the number of complaints from various users on this mailing list 
later when 1.7 goes mainstream...

But wait, I still have a question: why is there a difference between 
	bash --login
and
	bash
where in the latter case CP1252 (or the default ANSI codepage) 
*is* still the default?
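
One way to narrow this down would be to run the same small report in 
both shells and compare the output. The Python sketch below assumes the 
difference shows up in the locale environment variables, which this 
thread has not established; locale_report is a made-up helper name, and 
the LC_ALL > LC_CTYPE > LANG precedence is the usual POSIX rule.

```python
import locale
import os

# Variables that decide the character set, in POSIX precedence order.
LOCALE_VARS = ("LC_ALL", "LC_CTYPE", "LANG")

def locale_report(env=os.environ):
    """Return the locale variables plus the first non-empty one (hypothetical helper)."""
    report = {name: env.get(name) for name in LOCALE_VARS}
    # The first variable that is set and non-empty wins.
    report["effective"] = next(
        (env[name] for name in LOCALE_VARS if env.get(name)), None
    )
    return report

if __name__ == "__main__":
    for key, value in locale_report().items():
        print(f"{key}={value!r}")
    # Encoding Python derives from the current settings, without
    # calling setlocale() itself.
    print("preferred encoding:", locale.getpreferredencoding(False))
```

If the two shells print different values, the login startup files would 
be the first place to look; if they print the same, the difference must 
come from somewhere else.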

Thomas

