Mail Archives: cygwin/2002/07/02/17:53:00

Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm
List-Subscribe: <mailto:cygwin-subscribe AT cygwin DOT com>
List-Archive: <http://sources.redhat.com/ml/cygwin/>
List-Post: <mailto:cygwin AT cygwin DOT com>
List-Help: <mailto:cygwin-help AT cygwin DOT com>, <http://sources.redhat.com/ml/#faqs>
Sender: cygwin-owner AT cygwin DOT com
Mail-Followup-To: cygwin AT cygwin DOT com
Delivered-To: mailing list cygwin AT cygwin DOT com
X-WM-Posted-At: avacado.atomice.net; Tue, 2 Jul 02 22:50:32 +0100
Message-ID: <01f801c22212$7d0cecf0$0100a8c0@advent02>
From: "Chris January" <chris AT atomice DOT net>
To: <cygwin AT cygwin DOT com>, <v AT iki DOT fi>
References: <20020701085851 DOT GD9092 AT niksula DOT cs DOT hut DOT fi> <20020702213825 DOT GF9092 AT niksula DOT cs DOT hut DOT fi>
Subject: Re: Accessing filenames with different charsets
Date: Tue, 2 Jul 2002 22:50:31 +0100
MIME-Version: 1.0
X-Priority: 3
X-MSMail-Priority: Normal
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2600.0000

> > Sorry if this has already been discussed, but I couldn't find it in the
> > archive nor in the FAQ...
> >
> > If I have a file name with Russian characters in it, cygwin is unable to
> > access it:
> >
> > > ls
> > ????.TEST
> >
> > (Russian characters are shown as '?' in directory listing, but ls does
> > find the file).
> >
> > If I try to access it, however, open fails:
> >
> > > touch *
> > touch: '????.TEST': no such file or directory
> >
> > Same deal with less, cp, rm, rsync, etc.
>
> Okay, it seems cygwin readdir() returns the filenames as "????.TEST"
> (where the ?s really are question marks, ASCII 0x3f). Looking at
> fhandler_disk_file.cc, this can't be caused by much other than
> FindFirstFileA() returning "????.TEST". And indeed, I made a little
> non-Unicode test program that called FindFirstFile, and it returned
> "????.TEST" ("\x3f\x3f\x3f\x3f.TEST").
>
> To access the file, the wide-char versions of the Find*File() functions
> would probably have to be used (or is there another way?). I have no idea
> how this could be integrated into the cygwin framework...
>
> Any ideas?
Qt (from Trolltech) encodes Unicode filenames before they are used. In
Cygwin we could do the reverse, i.e. use Find*FileW and then encode the
Unicode as a local ANSI string. If we do the encoding manually in Cygwin,
rather than let Windows do it for us, this would overcome the problem. I
will try to put together a patch for this that you can test. One possibility
is to encode Unicode strings as UTF-8.
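
To make the UTF-8 option concrete, here is a rough sketch of the manual
encoding step. It is not the patch itself and not existing Cygwin code:
utf16_to_utf8() is a hypothetical helper name, and the sketch assumes the
input is a NUL-terminated UTF-16 string such as the cFileName field that
FindFirstFileW fills in.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not a Cygwin or Win32 function): encode a
   NUL-terminated UTF-16 string as NUL-terminated UTF-8.
   Returns the number of bytes written, excluding the NUL, or
   (size_t)-1 if the output buffer is too small. */
size_t utf16_to_utf8(const uint16_t *in, char *out, size_t outsize)
{
    size_t n = 0;

    assert(in != NULL && out != NULL);

    while (*in != 0) {
        uint32_t cp = *in++;
        unsigned char buf[4];
        size_t len;

        if (cp >= 0xD800 && cp <= 0xDBFF && *in >= 0xDC00 && *in <= 0xDFFF) {
            /* Combine a high/low surrogate pair into one code point. */
            cp = 0x10000 + ((cp - 0xD800) << 10) + (uint32_t)(*in++ - 0xDC00);
        } else if (cp >= 0xD800 && cp <= 0xDFFF) {
            /* Unpaired surrogate: substitute U+FFFD. */
            cp = 0xFFFD;
        }

        if (cp < 0x80) {
            buf[0] = (unsigned char)cp;
            len = 1;
        } else if (cp < 0x800) {
            buf[0] = (unsigned char)(0xC0 | (cp >> 6));
            buf[1] = (unsigned char)(0x80 | (cp & 0x3F));
            len = 2;
        } else if (cp < 0x10000) {
            buf[0] = (unsigned char)(0xE0 | (cp >> 12));
            buf[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
            buf[2] = (unsigned char)(0x80 | (cp & 0x3F));
            len = 3;
        } else {
            buf[0] = (unsigned char)(0xF0 | (cp >> 18));
            buf[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
            buf[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
            buf[3] = (unsigned char)(0x80 | (cp & 0x3F));
            len = 4;
        }

        /* Always leave room for the terminating NUL. */
        if (n + len + 1 > outsize)
            return (size_t)-1;
        memcpy(out + n, buf, len);
        n += len;
    }

    if (n + 1 > outsize)
        return (size_t)-1;
    out[n] = '\0';
    return n;
}
```

Because every UTF-16 name maps to a distinct byte string, nothing collapses
to '?' the way it does with the ANSI code page. The reverse mapping (UTF-8
back to UTF-16 before calling the wide-char file functions) would also be
needed and is not shown here.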

Chris



--
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
Bug reporting:         http://cygwin.com/bugs.html
Documentation:         http://cygwin.com/docs.html
FAQ:                   http://cygwin.com/faq/
