Mail Archives: cygwin/2010/03/16/17:07:52

In-Reply-To: <21BD29CF0D2D42A1ABD17B6AEB918C77@pleaset>
References: <493F5820D3F64434A76F433604C79D4A AT pleaset> <416096c61003160019p24e58433x4a969c0f99068fa6 AT mail DOT gmail DOT com> <6C05DF4D85804B3A865E7FE549B0475E AT pleaset> <416096c61003161315p504dff5dn7d1e847db01754c8 AT mail DOT gmail DOT com> <21BD29CF0D2D42A1ABD17B6AEB918C77 AT pleaset>
Date: Tue, 16 Mar 2010 22:07:40 +0000
Message-ID: <416096c61003161507l2387db44t99685c8e6fa2400f@mail.gmail.com>
Subject: Re: filenames with characters that have the high bit set
From: Andy Koppe <andy DOT koppe AT gmail DOT com>
To: dbyron AT dbyron DOT com, cygwin AT cygwin DOT com

David Byron:
> I suppose, but the point about \x18 not working with a character set that
> represents the desired codepoint wasn't clear.  Nor was the bash syntax for
> using \x in general.  It's in the bash man page and not cygwin-specific, but
> an example showing the gory details would have helped me at least.

Hmm, it certainly looks like it managed to confuse you, but more
detail on the \x18 stuff might mislead more people into thinking they
have to use it to access non-ASCII filenames.
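
For reference, here is a minimal sketch of the bash syntax in question, with the caveat above that it is not needed just to access non-ASCII filenames. It assumes bash (the $'...' form is ANSI-C quoting, where \xHH expands to the byte with hex value HH); the helper name hex_bytes is made up for this example.

```shell
#!/usr/bin/env bash
# Sketch only: hex_bytes is a hypothetical helper, not part of bash or Cygwin.
hex_bytes() {
    # Print the bytes of the first argument as lowercase hex, no separators.
    printf '%s' "$1" | od -An -tx1 | tr -d ' \n'
}

ctrl=$'\x18'          # a single control byte (CAN)
utf8_e=$'\xC3\xA9'    # the two-byte UTF-8 encoding of U+00E9 (e-acute)
latin_e=$'\xE9'       # the one-byte ISO-8859-1 encoding of the same character

echo "ctrl:    $(hex_bytes "$ctrl")"
echo "utf8_e:  $(hex_bytes "$utf8_e")"
echo "latin_e: $(hex_bytes "$latin_e")"
```

The three lines should report 18, c3a9 and e9 respectively: \x inserts raw bytes, so which character they stand for depends on the character set in use.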


>> What do you mean by up-arrow? I'm getting a question mark, because
>> that's what ls prints for non-printable characters by default. You can
>> choose various quoting styles using the --quoting-style option.
>
> I mean the uparrow that ls prints with --show-control-chars.

Ah, that's a Windows speciality, where the control chars have a second
life as graphical symbols:
http://blogs.msdn.com/michkap/archive/2005/02/26/381020.aspx.
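
To make the ls behaviour concrete, here is a rough sketch that assumes GNU coreutils ls and bash. The scratch directory and the file name a<0x18>b are purely illustrative, and LC_ALL=C is set only so that the result does not depend on the locale.

```shell
#!/usr/bin/env bash
# Sketch assuming GNU ls; directory and file name are illustrative only.
export LC_ALL=C
dir=$(mktemp -d)
touch "$dir/a"$'\x18'"b"    # file name containing the control byte 0x18

hidden=$(ls -q "$dir")                        # -q (--hide-control-chars): '?' for nongraphic bytes
escaped=$(ls --quoting-style=escape "$dir")   # backslash escapes, octal for control bytes
raw=$(ls --show-control-chars "$dir")         # control byte passed through unchanged

printf 'hidden:  %s\n' "$hidden"
printf 'escaped: %s\n' "$escaped"
printf 'raw hex: %s\n' "$(printf '%s' "$raw" | od -An -tx1 | tr -d ' \n')"
rm -rf "$dir"
```

With GNU ls this should give a?b, a\030b and the hex string 611862. What the raw byte looks like on screen is then up to the terminal, which is where the Windows symbols come in.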


>> > $ mkshortcut -n shortcut$'\xC3\xA9' plain; echo $?
>> > $ readshortcut shortcut$'\xE9'
>>
>> I'm afraid these aren't yet Unicode-ready, i.e. they still use Windows
>> "ANSI" APIs.
>
> Guess it's time to roll up my sleeves and write a patch.

That'd be great. Here's a starting point:
http://cygwin.com/ml/cygwin-apps/2010-01/msg00001.html

Andy

