Date: Tue, 6 Oct 2009 18:08:00 +0200
From: Corinna Vinschen
To: cygwin AT cygwin DOT com
Subject: Re: [ANNOUNCEMENT] [1.7] Updated: cygwin-1.7.0-62
Message-ID: <20091006160800.GS12789@calimero.vinschen.de>
In-Reply-To: <4ACB6309.9020609@cornell.edu>

On Oct 6 11:32, Ken Brown wrote:
> On 10/3/2009 9:59 AM, Corinna Vinschen wrote:
>> Apart from bugfixes, this patch contains a change to the
>> internationalization efforts in Cygwin which crystallized out of a couple
>> of longish discussions on the cygwin and cygwin-developer lists.
>>
>> Here's how it's supposed to work in future:
> [...]
>> - The "C" locale's default charset is UTF-8.
>
> Does this mean that non-ASCII characters are supposed to display OOTB, or
> is some user configuration expected?  Here's a test case.
>
> I've tried to view the attached file (extracted from the output of fc-list)
> in various ways, and here's what I've found (running XP in the U.S., with
> no language-related customization):
>
> - Using emacs under X, emacs recognizes the file as UTF-8 and displays the
>   foreign characters correctly.
>
> - 'cat temp.txt' in the cygwin console produces lots of question marks.

I don't understand this.  Are you sure you're running the latest -62
release?  Without any environment setting (LC_ALL, LC_CTYPE, LANG), the
console uses UTF-8 by default, just like everything else.
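
To rule out a stray setting, it can help to check which of those three
variables, if any, is actually in effect in the shell in question.  The
following is only a sketch: the function name charset_source is made up
for illustration, and the precedence it applies (LC_ALL, then LC_CTYPE,
then LANG, with an empty value treated as unset) is the usual POSIX rule,
assumed here rather than taken from anything Cygwin-specific.

```shell
#!/bin/sh
# Hypothetical helper: report which locale variable, if any, decides
# the character set for the current shell.
# Assumed precedence (POSIX): LC_ALL, then LC_CTYPE, then LANG.
# An empty value counts the same as an unset variable.
charset_source() {
    if [ -n "${LC_ALL:-}" ]; then
        echo "LC_ALL=$LC_ALL"
    elif [ -n "${LC_CTYPE:-}" ]; then
        echo "LC_CTYPE=$LC_CTYPE"
    elif [ -n "${LANG:-}" ]; then
        echo "LANG=$LANG"
    else
        echo "none set"
    fi
}

# Show the result for the current environment.
charset_source
```

If this reports "none set", the default described above should apply.
Anything else means a setting inherited from the environment is
overriding it.
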
If I call `cat temp.txt', I get a selection of the finest native
characters (looks like a mix of eastern European umlauts, Greek, and
Russian).

With vim, I get a few weird characters, which appears to be because vim
doesn't really recognize the file as UTF-8.  As soon as I set $LANG to
(for instance) C.UTF-8, vim is happy as well.  Alternatively,
`:set encoding=utf-8' in vim is sufficient.


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat

--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple