Mail Archives: cygwin/2009/04/02/03:41:01
On Apr 1 15:32, David Rothenberger wrote:
> When codepage:utf was supported, this worked fine. Now, it fails, even
> when I have LANG=en_US.UTF-8 in my environment. It all boils down to
> this python code:
>
> import os
> os.listdir('.')
>
> (That's an example I run from within the directory.) This fails with an
> error
>
> OSError: [Errno 138] Invalid or incomplete multibyte or wide
> character: '.'
>
> unless one does this first:
>
> import locale
> locale.setlocale(locale.LC_ALL, '')
That's always the better approach; otherwise the application runs
in the C locale.
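As a minimal sketch of that advice (written for a current Python 3, which is an assumption, since the original report predates it), an application can adopt the environment's locale before it touches the filesystem. The fallback branch is illustrative and not something rdiff-backup is known to do:

```python
import locale
import os

# Adopt the locale named in the environment (LANG, LC_ALL, ...) instead of
# staying in the default "C" locale.
try:
    locale.setlocale(locale.LC_ALL, '')
except locale.Error:
    # The environment names a locale this system does not provide;
    # the process simply stays in the "C" locale.
    pass

# List the current directory only after the locale has been set.
names = os.listdir('.')
```

Doing this once at startup, before any filename handling, keeps the charset consistent for the rest of the process.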
> I've patched rdiff-backup to do this, but I'm still wondering if this is
> the correct thing to do. I know that on my Linux machine, I don't have
> to do this, but I'm not sure if that's because there's some default
> locale that's being picked up by Python from somewhere other than the
> environment.
The basic problem is that Windows stores filenames in UTF-16, while Linux
and other OSes store a filename as a simple, zero-terminated byte
stream. A simple byte stream is always valid. OTOH, a conversion from
UTF-16 to a single-byte charset always has characters which can't be
converted. To work around that, I created the filename conversion method
explained in
http://cygwin.com/1.7/cygwin-ug-net/using-specialnames.html#pathnames-unusual
I'm not sure why this doesn't work in your simple case. The locale is C
because the application didn't call setlocale, so the resulting charset
is ASCII. The filename should have been converted to use the ASCII
SO/UTF-8 sequence for the characters that can't be converted.
[...time passes...]
And it works as designed in your above testcase.
I tested with a filename containing a Euro sign (Unicode 0x20ac), in
HTML speak "qq&euro;". Cygwin converted it to "qq\016\342\202\254",
i.e. ASCII SO (\016) followed by the three UTF-8 bytes of the Euro sign.
The strace looks perfectly normal. I have no idea what python complains
about!
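The byte sequence from the Euro sign test can be reproduced with a short Python sketch. This is only an illustration of the result reported above, not Cygwin's actual code (which lives in the Cygwin DLL). The helper name to_ascii_with_so is hypothetical, and emitting one SO byte per unconvertible character is an assumption based on the single tested filename:

```python
SO = b"\x0e"  # ASCII Shift Out, written \016 in octal


def to_ascii_with_so(name):
    """Hypothetical helper: build the byte form described above for an
    ASCII charset. Plain ASCII characters pass through unchanged; every
    other character becomes SO followed by its UTF-8 bytes (assumed)."""
    out = bytearray()
    for ch in name:
        if ord(ch) < 0x80:
            out += ch.encode("ascii")
        else:
            out += SO + ch.encode("utf-8")
    return bytes(out)


# The tested filename: "qq" followed by the Euro sign (U+20AC).
encoded = to_ascii_with_so("qq\u20ac")
```

For the tested filename this gives the same bytes as reported above, "qq\016\342\202\254".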
Jason, can you shed some light on this problem?
Corinna
--
Corinna Vinschen Please, send mails regarding Cygwin to
Cygwin Project Co-Leader cygwin AT cygwin DOT com
Red Hat