Message-ID: <49E76991.6090106@gmail.com>
Date: Thu, 16 Apr 2009 18:23:29 +0100
From: Dave Korn
To: cygwin AT cygwin DOT com
Subject: Re: UTF-8 problem/bug with Cygwin 1.7
In-Reply-To: <49E75CE7.4010004@danbbs.dk>

Gunnar Degnbol wrote:
> I have a strange problem with UTF-8 characters when running bash from
> the Windows command line. I hoped it would go away with the new Cygwin
> 1.7.0-46, but it is still there.

  Can't reproduce this with either -45 or -46:

Microsoft Windows 2000 [Version 5.00.2195]
(C) Copyright 1985-2000 Microsoft Corp.
C:\>F:

F:\>cd cygwin-1.7

F:\cygwin-1.7>cd bin

F:\cygwin-1.7\bin>.\echo £
£

F:\cygwin-1.7\bin>bash -c 'echo a'
a

F:\cygwin-1.7\bin>bash -c 'echo £'
£

F:\cygwin-1.7\bin>set LANG=en_US.UTF-8

F:\cygwin-1.7\bin>bash -c 'echo £'
£

F:\cygwin-1.7\bin>cygcheck -c cygwin
Cygwin Package Information
Package              Version        Status
cygwin               1.7.0-45       OK

F:\cygwin-1.7\bin>  [ runs setup.exe to upgrade ]

F:\cygwin-1.7\bin>cygcheck -c cygwin
Cygwin Package Information
Package              Version        Status
cygwin               1.7.0-46       OK

F:\cygwin-1.7\bin>bash -c 'echo £'
£

F:\cygwin-1.7\bin>

> C:\cygwin17\bin>bash -c 'echo £'
> bash: $'echo \302\243': command not found
>
> C:\cygwin17\bin>bash -c "$'echo \302\243'"
> bash: $'echo \302\243': command not found
>
> C:\cygwin17\bin>bash -c "echo $'\302\243'"
> £
>
> It works if I don't set LANG, or set it to something else than UTF-8. It
> also works in Cygwin 1.5.
> Seems like UTF-8 makes bash escape the whole command line if it contains
> non-ascii characters. Maybe it should only escape the non-ascii
> characters? This might still cause problems with text in quotes.

  I'm not very familiar with this "$'" construct, but yes, it's blatantly
wrong to quote the whole line using it, since then the line gets treated as a
single word for parsing purposes. I have no idea who is doing this escaping,
or why I'm not getting it, though.

    cheers,
      DaveK
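
  For reference, the single-word problem described above can be sketched in
bash itself. This is an illustration only, not taken from either poster's
session: count_words is a hypothetical helper, and the sketch assumes that
"$'...'" is bash's ANSI-C quoting, in which \302\243 stands for the two UTF-8
bytes of the pound sign.

```shell
#!/usr/bin/env bash
# count_words is a hypothetical helper: it prints how many words it received.
count_words() { echo "$#"; }

# The whole command wrapped in one $'...' string is a single word, so bash
# would go looking for a command whose name is "echo", a space, and the
# pound sign, all as one string.
whole=$(count_words $'echo \302\243')

# Quoting only the non-ASCII argument leaves "echo" as a word of its own.
split=$(count_words echo $'\302\243')

echo "whole-line quoting:    $whole word(s)"
echo "argument-only quoting: $split word(s)"
```

  The first count comes out as 1 and the second as 2, which matches the
"command not found" result in the quoted report versus the working
"echo $'\302\243'" form.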