Date: Mon, 27 Mar 2006 23:54:15 -0500 (EST)
From: Igor Peshansky
To: Lapo Luchini
cc: cygwin AT cygwin DOT com
Subject: Re: Locales with wrong umlauts

On Sun, 26 Mar 2006, Lapo Luchini wrote:

> Igor Pechtchanski writes:
>
> > On Sun, 27 Jun 2004, A. Alper Atici wrote:
> >
> > > try the following:
> > > set OUTPUT_CHARSET=iso-8859-1
> >
> > Wow.  Thanks, this was *extremely* useful.  Interestingly enough, the
> > OUTPUT_CHARSET option was not mentioned anywhere in the gettext/libintl
> > documentation, but a search for it unearthed another couple of messages on
> > this list from earlier this year with the same info[*] (one was from you).
>
> Extremely useful to me too, I was quite fed up to see "`a" instead of "à" =)
>
> I also noticed that OUTPUT_CHARSET=CP1252 *may* be preferred, compare the
> following outputs:
>
> % mtn up
> monotone: gi`a aggiornato a '1848d7dfabfbed09fe53856da038e31eed0f42dc'
> % OUTPUT_CHARSET=CP1252 monotone up
> monotone: già aggiornato a ÿÿ1848d7dfabfbed09fe53856da038e31eed0f42dcÿÿ
> % OUTPUT_CHARSET=ISO8859-1 monotone up
> monotone: già aggiornato a `1848d7dfabfbed09fe53856da038e31eed0f42dc´
> % OUTPUT_CHARSET=ISO8859-15 mtn up
> monotone: già aggiornato a '1848d7dfabfbed09fe53856da038e31eed0f42dc'
>
> In order to really "check" it some gettext with an euro symbol should be
> used, but I'm not aware of any that does and I don't have the time to
> create one right now 0=)
>
> Instead of putting it simply in some FAQ couldn't Cygwin define that env
> var correctly "by default"? (after all the system *knows* which charset
> it is using, I guess?)

The system has no idea what charset it's using, because it depends on the
font you set for your terminal, which is outside of the terminal's
control.  Even if you use a Unicode font with charset conversion, the
charset is specified outside of the console.

Incidentally, since this subject came up: ls has a "--show-control-chars"
option, but rm, mv, cp, and a bunch of other tools don't.  So, if you run
rm in interactive mode, it doesn't display filenames properly.  For
example:

$ touch é a
$ ls é
é
$ mv -i a é
mv: overwrite `\351'? n
$ rm -i é
rm: remove regular empty file `\351'? y
$

Is there any way to tell mv, rm &co to display non-ASCII characters in
filenames?  I know this isn't Cygwin-specific, but I'm not even sure what
to Google for.  Eric?
	Igor
-- 
                                http://cs.nyu.edu/~pechtcha/
      |\      _,,,---,,_        pechtcha AT cs DOT nyu DOT edu | igor AT watson DOT ibm DOT com
ZZZzz /,`.-'`'    -.  ;-;;,_    Igor Peshansky, Ph.D. (name changed!)
     |,4-  ) )-,_. ,\ (  `'-'   old name: Igor Pechtchanski
    '---''(_/--'  `-'\_) fL     a.k.a JaguaR-R-R-r-r-r-.-.-.  Meow!

"Las! je suis sot... -Mais non, tu ne l'es pas, puisque tu t'en rends compte."
"But no -- you are no fool; you call yourself a fool, there's proof enough in
that!" -- Rostand, "Cyrano de Bergerac"
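
A side note on the `\351' that mv and rm printed in the session above: it reads as an octal escape for the single byte that encodes é in ISO-8859-1, the charset this message was sent in. The sketch below is only an illustration of that arithmetic. octal_escape_to_hex is a hypothetical helper, not something coreutils provides, and the last step assumes iconv is installed.

```shell
#!/bin/sh
# Hypothetical helper (not part of coreutils): turn an octal escape such as
# \351 into the hexadecimal value of the byte it stands for.
octal_escape_to_hex() {
    digits=${1#\\}                 # drop the leading backslash
    printf '%02x\n' "0$digits"     # a leading 0 makes printf read the number as octal
}

# The escape shown by mv and rm in the session above.
octal_escape_to_hex '\351'

# Assumption: iconv is available. Emit that one byte and convert it from
# ISO-8859-1 to UTF-8, so a UTF-8 terminal can display the character.
if command -v iconv >/dev/null 2>&1; then
    printf '\351' | iconv -f ISO-8859-1 -t UTF-8
    printf '\n'
fi
```

The helper prints e9 for '\351', which is the code point of é in ISO-8859-1. This does not change what mv or rm display; it only shows how to read the escape back into a character.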