To: cygwin AT cygwin DOT com
From: Ronald Fischer
Subject: Encoding of German 'umlauts' - please explain
Date: Thu, 24 Sep 2009 08:17:42 +0000 (UTC)

Maybe someone could enlighten me about the following:

In Cygwin bash I see

    $ echo ü | od -cx
    0000000 374 \n
            0afc
    0000002

That means the German letter ü is encoded as 0xFC.

If I do the same in the CMD shell (the 'od' used here comes from the
GNU Utilities for Windows), I see:

    echo ü | od -cx
    0000000 201 \r \n
            2081 0a0d
    0000004

That is, ü is encoded as 0x81.

Why is this different? I am aware that, for historical reasons, different
encodings exist (the old DOS encoding, the Windows ANSI encoding, etc.).
I wouldn't have expected such differences, however, when comparing
bash.exe with cmd.exe.
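
For reference, the two byte values 0xFC and 0x81 can be reproduced outside either shell. What follows is only a sketch under stated assumptions, not a diagnosis of what bash.exe or cmd.exe actually do: it assumes that 0xFC corresponds to the Windows ANSI code page (cp1252, which agrees with Latin-1 at this position) and that 0x81 corresponds to an OEM/DOS code page (cp437 or cp850). The helper names `byte_for` and `od_x_words` are hypothetical and were made up for this illustration. The second helper mimics how `od -x` groups bytes into little-endian 16-bit words, which is why the bytes fc 0a are printed as 0afc.

```python
import struct


def byte_for(ch, codec):
    """Return the hex form of `ch` encoded with the given legacy codec."""
    return ch.encode(codec).hex()


def od_x_words(data):
    """Group bytes into little-endian 16-bit words, roughly like `od -x`.

    Assumption: a little-endian host (as on x86 Windows); odd-length
    input is padded with a zero byte.
    """
    if len(data) % 2:
        data += b"\x00"
    return ["%04x" % word for (word,) in struct.iter_unpack("<H", data)]


if __name__ == "__main__":
    u_umlaut = "\u00fc"  # the letter ü

    # Assumed candidates: ANSI-style code pages vs. OEM/DOS code pages.
    for codec in ("cp1252", "latin-1", "cp437", "cp850"):
        print(codec, byte_for(u_umlaut, codec))

    # Byte streams matching the two dumps: ü plus newline (bash),
    # and ü, space, CR, LF (CMD).
    print(od_x_words(b"\xfc\n"))
    print(od_x_words(b"\x81 \r\n"))
```

If the assumption holds, the first two codecs yield fc and the last two yield 81, which matches the two dumps and suggests that the shells hand the character to `od` in different code pages.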