Date: Fri, 8 May 2009 18:04:15 +0200
From: Corinna Vinschen
To: cygwin AT cygwin DOT com
Subject: Re: [1.7][python] File operation API to multibyte filenames fails.
Message-ID: <20090508160415.GM21324@calimero.vinschen.de>
In-Reply-To: <3f0ad08d0905080621j2b1f97b9p317ee1df0f1dfc76@mail.gmail.com>
References: <3f0ad08d0905080602s36a9eddg852eaa3ea3a2a69f AT mail DOT gmail DOT com> <20090508130901 DOT GL21324 AT calimero DOT vinschen DOT de> <3f0ad08d0905080621j2b1f97b9p317ee1df0f1dfc76 AT mail DOT gmail DOT com>

On May  8 22:21, IWAMURO Motonori wrote:
> Hi.
>
> 2009/5/8 Corinna Vinschen:
> > Your scripts.  Python correctly doesn't use setlocale because it's
> > the responsibility of the application to set the locale if it uses
> > non-ASCII chars.  And Cygwin simply has no chance to convert UTF-8
> > to UTF-16 if the application doesn't ask for UTF-8.
>
> Oh, it is very very difficult.
> Because ALL python utilities which access files or directories fail.
> For example, Mercurial doesn't work.

I can reproduce this issue and I created a simple application to
create your example filenames in the current dir (see below).  Given
the python testcase

  import os
  os.listdir(".")

I can't see a fault in Cygwin.  Neither from strace, nor in a GDB session.
The readdir calls return the filenames using the SO sequences so that a
valid byte-stream is created which also works in the C locale.  However,
for some reason there's an EILSEQ (138) errno generated, but from what I
can tell it's not generated in Cygwin or newlib code.

So I'd like to ask Jason, our python maintainer, to have a look into
that.  Maybe we just need a python rebuild for 1.7?


Corinna


This is the simple code I used to create the Japanese filenames:

  #include <fcntl.h>    /* open, O_CREAT, O_RDWR */
  #include <locale.h>   /* setlocale */
  #include <unistd.h>   /* close */

  int
  main ()
  {
    /* UTF-8 encoding of "スタートメニュー" */
    char file1[] = { 0xe3, 0x82, 0xb9, 0xe3, 0x82, 0xbf, 0xe3, 0x83,
                     0xbc, 0xe3, 0x83, 0x88, 0xe3, 0x83, 0xa1, 0xe3,
                     0x83, 0x8b, 0xe3, 0x83, 0xa5, 0xe3, 0x83, 0xbc,
                     0 };
    /* UTF-8 encoding of "デスクトップ" */
    char file2[] = { 0xe3, 0x83, 0x87, 0xe3, 0x82, 0xb9, 0xe3, 0x82,
                     0xaf, 0xe3, 0x83, 0x88, 0xe3, 0x83, 0x83, 0xe3,
                     0x83, 0x97, 0 };

    /* Ask for UTF-8 so the byte strings are treated as UTF-8 filenames. */
    setlocale (LC_ALL, "en_US.UTF-8");
    int fd = open (file1, O_CREAT|O_RDWR, 0644);
    close (fd);
    fd = open (file2, O_CREAT|O_RDWR, 0644);
    close (fd);
    return 0;
  }

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat
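
To illustrate the UTF-8 to UTF-16 conversion discussed in this thread, the following Python sketch decodes the two byte arrays from the C program above as UTF-8 and counts the UTF-16 code units each name would occupy. It assumes Python 3; the helper names `describe` and `decodes_as_ascii` are made up for this example, and it does not reproduce Cygwin's actual conversion code or its SO-sequence fallback.

```python
# Byte sequences copied from file1 and file2 in the C program
# (without the trailing NUL terminator).
FILE1 = bytes([0xe3, 0x82, 0xb9, 0xe3, 0x82, 0xbf, 0xe3, 0x83,
               0xbc, 0xe3, 0x83, 0x88, 0xe3, 0x83, 0xa1, 0xe3,
               0x83, 0x8b, 0xe3, 0x83, 0xa5, 0xe3, 0x83, 0xbc])
FILE2 = bytes([0xe3, 0x83, 0x87, 0xe3, 0x82, 0xb9, 0xe3, 0x82,
               0xaf, 0xe3, 0x83, 0x88, 0xe3, 0x83, 0x83, 0xe3,
               0x83, 0x97])


def describe(raw):
    """Decode raw filename bytes as UTF-8.

    Returns the decoded text and the number of 16-bit code units
    its UTF-16 encoding needs.
    """
    text = raw.decode("utf-8")
    utf16_units = len(text.encode("utf-16-le")) // 2
    return text, utf16_units


def decodes_as_ascii(raw):
    """Return True if the bytes are plain 7-bit ASCII."""
    try:
        raw.decode("ascii")
        return True
    except UnicodeDecodeError:
        return False


if __name__ == "__main__":
    for raw in (FILE1, FILE2):
        text, units = describe(raw)
        # ascii() keeps the output printable even on an ASCII-only console.
        print(len(raw), "bytes ->", units, "UTF-16 units:", ascii(text))
        print("  plain ASCII?", decodes_as_ascii(raw))
```

In this sketch the names decode to 8 and 6 characters respectively, while a plain ASCII decode rejects the bytes. That is only a loose analogy for the situation in the quoted text, where the application has not asked for UTF-8.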