Date: Tue, 22 Sep 2009 13:49:54 +0100
Message-ID: <416096c60909220549jaa601d9l26621e9910136a3@mail.gmail.com>
Subject: Re: The C locale
From: Andy Koppe
To: cygwin AT cygwin DOT com

2009/9/22 Lapo Luchini:
>> For example, a Windows filename "bäh" turns into "bÃ¤h" in the C locale,
>> while it shows up correctly with explicitly set ISO-8859-1 or CP1252.
>
> Uh?
> Doesn't seem so to me: if I create "bäh" in WindowsExplorer, then
> open up an UTF-8 mintty console I have a consistent output with both
> LANG=C and LANG=it_IT.UTF-8 (of course, since right now C is UTF-8):
>
> % LANG=C ls -l|egrep b.h
> -rw-r--r-- 1 lapo None     0 Sep 22 09:53 bäh
> % LANG=it_IT.UTF-8 ls -l|egrep b.h
> -rw-r--r-- 1 lapo None     0 22 Sep 09:53 bäh

You've presumably got mintty set to UTF-8, hence mintty's output
conversion turned ls's ISO-8859-1 "Ã¤" (i.e. "\xC3\xA4") into "ä".

> So I'm not sure what do you mean with 'a Windows filename "bäh" turns
> into "bÃ¤h" in the C locale'... you mean that a script sees it as
> 62C3A468 as opposed as 62E468? Or that actual "bÃ¤h" is shown somewhere?

Both. For the latter, try it in the default Cygwin console, without
any locale variables set.

> But OTOH as far as "not caring" goes, it sure can be a nice feature to
> be retro-compatible in that single case

Thanks. Unfortunately the "C" locale is rather important though,
because that's what people will be using unless they go to the effort
of finding out how to set a different locale.

Andy
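
The byte-level effect discussed in this message can be reproduced without Cygwin. What follows is only a minimal Python sketch: the variable names are made up for illustration, and it models nothing more than the encode/decode step, not what ls, mintty or the Cygwin console actually do internally. It shows the two byte sequences mentioned in the thread (62C3A468 and 62E468) and how the UTF-8 bytes look when read as ISO-8859-1.

```python
# Illustrative sketch only; names are hypothetical and not taken from the thread.
name = "b\u00e4h"  # the filename "bäh"

# The same name as a byte string in the two encodings under discussion.
utf8_bytes = name.encode("utf-8")         # b'b\xc3\xa4h'
latin1_bytes = name.encode("iso-8859-1")  # b'b\xe4h'

# Reading the UTF-8 bytes as ISO-8859-1 yields two characters
# (U+00C3, U+00A4) where the single "ä" used to be.
misread = utf8_bytes.decode("iso-8859-1")

# Reading them as UTF-8 again recovers the original name, which is
# roughly what a terminal set to UTF-8 would display.
recovered = utf8_bytes.decode("utf-8")

print(utf8_bytes.hex().upper())    # 62C3A468
print(latin1_bytes.hex().upper())  # 62E468
print([hex(ord(c)) for c in misread])
print(recovered == name)
```

The sketch prints code points instead of the misread string itself so that it also runs on a terminal that cannot display non-ASCII characters.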