Mail Archives: cygwin/2009/04/01/17:33:18

Message-id: <49D3EB8D.3040802@acm.org>
Date: Wed, 01 Apr 2009 15:32:45 -0700
From: David Rothenberger <daveroth AT acm DOT org>
To: cygwin <cygwin AT cygwin DOT com>
Subject: [1.7] codepage:utf removal and python

I came across a problem today with Cygwin 1.7 while using rdiff-backup, 
which is a Python program. I have a directory containing a file with a 
non-ASCII character in its name, and rdiff-backup was unable to back up 
that directory.

When codepage:utf was supported, this worked fine. Now it fails, even 
when I have LANG=en_US.UTF-8 in my environment. It all boils down to 
this Python code:

   import os
   os.listdir('.')

(That's an example I run from within the directory.) This fails with an 
error

   OSError: [Errno 138] Invalid or incomplete multibyte or wide character: '.'

unless one does this first:

   import locale
   locale.setlocale(locale.LC_ALL, '')

I've patched rdiff-backup to do this, but I'm still wondering if this is 
the correct thing to do. I know that on my Linux machine, I don't have 
to do this, but I'm not sure if that's because there's some default 
locale that's being picked up by Python from somewhere other than the 
environment.
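
For reference, here is a minimal sketch of the kind of workaround described above. It is not the actual rdiff-backup patch; the function names list_directory and current_ctype_locale are made up for illustration. It adopts the locale named in the environment before listing a directory, and it can also report which LC_CTYPE locale the interpreter is currently using, which may help when comparing Cygwin with a Linux machine.

```python
import locale
import os


def current_ctype_locale():
    """Report the LC_CTYPE locale currently in effect, without changing it."""
    # Calling setlocale() with only a category queries the current setting.
    return locale.setlocale(locale.LC_CTYPE)


def list_directory(path="."):
    """Adopt the environment's locale, then list the directory.

    Hypothetical helper for illustration; rdiff-backup has no function
    of this name.
    """
    try:
        # An empty string selects the locale named by LANG / LC_* in the
        # environment (e.g. LANG=en_US.UTF-8).
        locale.setlocale(locale.LC_ALL, "")
    except locale.Error:
        # The environment names a locale this system does not provide;
        # stay in the current locale rather than abort.
        pass
    return os.listdir(path)


if __name__ == "__main__":
    print("LC_CTYPE before:", current_ctype_locale())
    names = list_directory(".")
    print("LC_CTYPE after: ", current_ctype_locale())
    print(len(names), "entries")
```

Printing the LC_CTYPE value before and after the call shows whether the interpreter started out in the "C" locale or had already picked something up from the environment.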

To sum it up: Is this a bad unintended consequence of removing codepage:utf?

-- 
David Rothenberger  ----  daveroth AT acm DOT org

Truthful, adj.:
         Dumb and illiterate.
                 -- Ambrose Bierce, "The Devil's Dictionary"

