Date: Tue, 22 Sep 2009 07:47:44 +0100
Subject: Re: The C locale
From: Andy Koppe
To: cygwin@cygwin.com

2009/9/22 Lapo Luchini:
> Andy Koppe wrote:
>> This way, the non-ASCII needs of most users are covered
>> out-of-the-box [...]
>> Windows filenames show up correctly in Cygwin as long as they're
>> limited to the ANSI codepage.
>
> I fail to see how that is a desirable thing.
> Filesystem is UTF-16, Cygwin is now Unicode-aware, but anything that
> doesn't fit ANSI is thrown away [...]?

No, it isn't. UTF-16 filename characters that can't be represented in
the current charset are encoded by a ^N followed by the character's
UTF-8 representation.

The current C locale, on the other hand, simply represents all
non-ASCII characters as UTF-8, even though the application charset is
ISO-8859-1.
This means that even those characters that can be represented in the
application charset show up incorrectly. For example, a Windows
filename "bäh" turns into "bÃ¤h" in the C locale, while it shows up
correctly with explicitly set ISO-8859-1 or CP1252.

> As a user, the ability to show correctly formatted UTF-8 filenames is
> one of the features I most appreciated in Cygwin-1.7

That ability isn't going anywhere. As before, you need to set your
locale to one with a UTF-8 charset to get full UTF-8 support.

Btw, are you actually using the C locale?

Andy
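
P.S. To make the two behaviours discussed above concrete, here is a
minimal Python sketch. It is an illustration only, not Cygwin's actual
code: the function names encode_filename and c_locale_view are made up,
and the sketch works on Unicode code points rather than UTF-16 code
units. It assumes ^N means the ASCII SO control character (0x0E).

```python
SO = b"\x0e"  # ^N, the ASCII "Shift Out" control character


def encode_filename(name: str, charset: str) -> bytes:
    """Hypothetical sketch of the fallback scheme described above.

    Characters that the target charset can represent are encoded
    normally. Any other character is written as ^N followed by its
    UTF-8 bytes, so nothing is thrown away.
    """
    out = bytearray()
    for ch in name:
        try:
            out += ch.encode(charset)
        except UnicodeEncodeError:
            out += SO + ch.encode("utf-8")
    return bytes(out)


def c_locale_view(name: str) -> str:
    """Hypothetical sketch of the C-locale problem described above.

    The filename is handed over as UTF-8 bytes, but the application
    interprets those bytes as ISO-8859-1, so each non-ASCII character
    turns into two wrong characters.
    """
    return name.encode("utf-8").decode("iso-8859-1")


# "b\u00e4h" is the filename "bäh" from the example.
garbled = c_locale_view("b\u00e4h")                   # 'bÃ¤h'
latin1 = encode_filename("b\u00e4h", "iso-8859-1")    # b'b\xe4h'
# U+03B1 (Greek alpha) does not exist in ISO-8859-1, so it takes the
# ^N + UTF-8 path instead of being dropped.
escaped = encode_filename("b\u03b1h", "iso-8859-1")   # b'b\x0e\xce\xb1h'
```

With a UTF-8 charset every character is representable, so the ^N
fallback in this sketch is never taken and the name comes out as plain
UTF-8.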