Mail Archives: cygwin/2010/03/16/15:36:42
> > > > $ echo $LC_ALL
> > > > en_US
> > >
> > > Hang on, where did that come from?
It was in my environment. My apologies for being dense.
> > I unset LC_ALL and...
>
> Where?
I unset LC_ALL in bash, which was the wrong place.
> > Now ls foo<tab> adds the actual accented character to
> > the command line, but when I press return I get:
> >
> > ls: cannot access foo<a gray box>: No such file or directory
And of course this works now. Sorry for the trouble.
> > I still get the right answer from test -f, when using
> > the shell builtin. /usr/bin/test tells me the file
> > doesn't exist.
>
> .. and that.
As does this, as long as I use the same encoding I used to originally create
the file, which is totally fine.
> > > The \x18 scheme is only used for codepoints that can
> > > not be represented in the selected character set, yet
> > > U+00E9 can be represented in CP1252. By definition, any
> > > Unicode codepoint can be represented in UTF-8, so the
> > > \x18 scheme is never used when that is selected.
> > >
> > > To enable C-style backslash interpretation, you need
> > > to use $'...' quoting.
> >
> > I now see the bash man page explains this. Must have
> > missed it the first time. The above paragraphs with
> > some examples (where \x18 is needed and where it isn't)
> > added to
> >
http://cygwin.com/cygwin-ug-net/using-specialnames.html#pathnames-unusual
> > would have gotten me farther before posting.
>
> But what I said is explained there already:
I suppose, but the point about \x18 not working with a character set that
represents the desired codepoint wasn't clear. Nor was the bash syntax for
using \x in general. It's in the bash man page and not cygwin-specific, but
an example showing the gory details would have helped me at least.
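
For anyone else who trips over this, here is a minimal sketch of the kind of
example I had in mind. It only shows generic bash quoting behavior, nothing
Cygwin-specific. The variable names and the hex helper are made up for
illustration, and it assumes od and tr are available.

```shell
# $'...' makes bash interpret C-style backslash escapes; plain '...' does not.
utf8=$'\xC3\xA9'   # U+00E9 as UTF-8: two bytes
cp1252=$'\xE9'     # U+00E9 as CP1252: one byte
literal='\xE9'     # no leading $: four literal characters, backslash included

# hex: hypothetical helper that dumps a string's bytes as hex digits
hex() { printf '%s' "$1" | od -An -tx1 | tr -d ' \n'; }

hex "$utf8";    echo
hex "$cp1252";  echo
hex "$literal"; echo
```

Which of the first two matches a file on disk depends on the character set
that was in effect when the file was created.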
> > And finally here are the steps that illustrate what's going on.
> >
> > $ touch $'\x18'; echo $?
> > 0
> >
> > ls shows a file named up-arrow (0x18):
>
> What do you mean by up-arrow? I'm getting a question mark, because
> that's what ls prints for non-printable characters by default. You can
> choose various quoting styles using the --quoting-style option.
I mean the up-arrow that ls prints with --show-control-chars. Another
important omission on my part. Doh!
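
To make the difference concrete, here is a rough sketch of how the same file
name shows up under two ls options. It assumes GNU coreutils ls. The temporary
directory and the variable names are only for illustration.

```shell
# Assumes GNU ls; the scratch directory keeps the odd file name out of the way.
dir=$(mktemp -d)
touch "$dir/"$'\x18'        # file whose name is the single byte 0x18

q_out=$(ls -q "$dir")       # -q: show non-printable characters as ?
b_out=$(ls -b "$dir")       # -b: C-style escapes, same as --quoting-style=escape
printf '%s\n' "$q_out" "$b_out"

rm -r "$dir"
```

With --show-control-chars the byte is written to the terminal as-is, so what
you see then depends on the terminal and its font.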
> Yep, but that's a bash vs ls issue rather than a Cygwin
> one. You'd get the same on Linux. But if you use control
> characters in filenames, you better know what you're doing
> anyway. Some argue that it shouldn't be allowed in the
> first place, e.g.
> http://www.dwheeler.com/essays/fixing-unix-linux-filenames.html
Thanks for the link. I don't typically use control characters in filenames.
Just an example.
> > $ mkshortcut -n shortcut$'\xC3\xA9' plain; echo $?
> > $ readshortcut shortcut$'\xE9'
>
> I'm afraid these aren't yet Unicode-ready, i.e. they still use Windows
> "ANSI" APIs.
Guess it's time to roll up my sleeves and write a patch.
-DB
--
Problem reports: http://cygwin.com/problems.html
FAQ: http://cygwin.com/faq/
Documentation: http://cygwin.com/docs.html
Unsubscribe info: http://cygwin.com/ml/#unsubscribe-simple