Mail Archives: cygwin/2009/10/31/15:14:36

X-Recipient: archive-cygwin AT delorie DOT com
Message-ID: <4AEC9A9F.6080201@monai.ca>
Date: Sat, 31 Oct 2009 13:14:23 -0700
From: Steven Monai <steve+cygwin AT monai DOT ca>
User-Agent: Thunderbird 2.0.0.23 (Windows/20090812)
MIME-Version: 1.0
To: cygwin AT cygwin DOT com
Subject: SOLVED: Removed 1.5.25 and installed 1.7.0, but still cannot access filenames containing Unicode
References: <4AE65A40 DOT 9070405 AT monai DOT ca> <20091027091532 DOT GB2076 AT calimero DOT vinschen DOT de> <4AE8A278 DOT 5060406 AT monai DOT ca> <20091029110030 DOT GN28753 AT calimero DOT vinschen DOT de> <4AE9F26B DOT 4030203 AT monai DOT ca>
In-Reply-To: <4AE9F26B.4030203@monai.ca>
X-IsSubscribed: yes
Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm
List-Id: <cygwin.cygwin.com>
List-Subscribe: <mailto:cygwin-subscribe AT cygwin DOT com>
List-Archive: <http://sourceware.org/ml/cygwin/>
List-Post: <mailto:cygwin AT cygwin DOT com>
List-Help: <mailto:cygwin-help AT cygwin DOT com>, <http://sourceware.org/ml/#faqs>
Sender: cygwin-owner AT cygwin DOT com
Mail-Followup-To: cygwin AT cygwin DOT com
Delivered-To: mailing list cygwin AT cygwin DOT com

Corinna Vinschen wrote:
>>    22   12422 [main] ls 1300 fhandler_disk_file::readdir_helper: wchar filename: "Mikey12\xf020.ai"
> 
> That's the problem.  The character in that file is *not* U+0323, but
> U+f020, a character in the Unicode private use range, which is used in
> Cygwin to map ASCII characters invalid in Windows filenames but valid
> in POSIX filenames.  It's also used to map multibyte characters > 0x80
> which are invalid in the current charset.  

Thanks for diagnosing my problem. I assumed the char was U+0323 because
I scrolled through the Windows Character Map app looking for something
that visually matched what I was seeing in the filename. Having no other
way to identify the char, I jumped to an incorrect conclusion.
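
For anyone else who needs to identify a character rather than guess by
appearance, here is a minimal sketch in Python. It is not something used
in this thread; the function name describe_non_ascii is made up for
illustration, and it assumes the filename is already available as a
Unicode string.

```python
import unicodedata


def describe_non_ascii(name):
    """Return (index, code point, general category, Unicode name) for
    every non-ASCII character in `name`."""
    report = []
    for index, ch in enumerate(name):
        if ord(ch) > 0x7F:
            report.append((
                index,
                "U+%04X" % ord(ch),
                unicodedata.category(ch),           # "Co" means private use
                unicodedata.name(ch, "<no name>"),  # private-use chars have no name
            ))
    return report


if __name__ == "__main__":
    # The filename from the strace output quoted above.
    for row in describe_non_ascii("Mikey12\uf020.ai"):
        print(row)
```

A category of "Co" marks a private use character, which is a hint that
the char is not what it looks like in a font.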

This also explains why my test on another machine produced the correct
result. Instead of actually copying the troublesome file over to that
machine, I lazily created a filename there containing the char (U+0323)
I presumed to be causing trouble on the first machine. I have since
verified that the char U+0323 works fine in filenames on all my Cygwin
1.7 boxes, while the char U+f020 fails on all of them.
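
To see which character a code point in that range would stand for, a
rough sketch follows. It assumes the mapping is a plain offset of 0xf000
added to the original code point; that offset is my assumption here, not
something stated in this thread, and the name unmap_private_use is
hypothetical.

```python
# Assumed offset: original code point + 0xF000 (not confirmed in this thread).
ASSUMED_BASE = 0xF000
RANGE_LAST = 0xF0FF


def unmap_private_use(name):
    """Replace each char in U+F000..U+F0FF with the char it would
    represent under the assumed offset; leave everything else alone."""
    out = []
    for ch in name:
        cp = ord(ch)
        if ASSUMED_BASE <= cp <= RANGE_LAST:
            out.append(chr(cp - ASSUMED_BASE))
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    print(repr(unmap_private_use("Mikey12\uf020.ai")))
```

Under that assumption, U+f020 would correspond to 0x20, an ASCII space.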

> You must not use characters
> in this range from U+f000 up to U+f0ff.  There's no solution to this
> except for "don't use these characters in filenames if they are not
> explicitly written there by either Cygwin or Microsoft's SUA".

The above two sentences should probably go into the User's Guide (UG).

> See http://cygwin.com/1.7/cygwin-ug-net/using-specialnames.html#pathnames-specialchars

That section of the UG only says that certain special chars are mapped
by Cygwin into the 0xf000 to 0xf0ff range. It does not explicitly say
that Cygwin may not be able to deal with filenames containing arbitrary
chars from that range, hence my suggestion that the UG be updated.
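
In the meantime, a small sketch for finding affected files: it walks a
directory tree and reports names containing any char from U+f000 to
U+f0ff. The function names are made up for illustration. I would expect
it to need a native Windows Python rather than a Cygwin one, so that the
names are seen as Windows stores them, but I have not verified that.

```python
import os

# Range Cygwin reserves for its own filename mapping, per the quoted text.
RESERVED_FIRST = 0xF000
RESERVED_LAST = 0xF0FF


def has_reserved_char(name):
    """True if `name` contains a char in U+F000..U+F0FF."""
    return any(RESERVED_FIRST <= ord(ch) <= RESERVED_LAST for ch in name)


def find_reserved_names(root):
    """Return paths under `root` whose final component contains a
    reserved-range char."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            if has_reserved_char(name):
                hits.append(os.path.join(dirpath, name))
    return hits


if __name__ == "__main__":
    for path in find_reserved_names("."):
        # Escape non-ASCII so the code points are visible on any console.
        print(path.encode("unicode_escape").decode("ascii"))
```

Any file it reports would have to be renamed from the Windows side
before Cygwin tools can be expected to open it.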

Thanks again,
-SM
--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
