Mailing-List: contact cygwin-help AT sourceware DOT cygnus DOT com; run by ezmlm
Sender: cygwin-owner AT sources DOT redhat DOT com
Delivered-To: mailing list cygwin AT sources DOT redhat DOT com
Message-Id: <4.3.2.7.0.20001023204853.00cfe100@pop.bresnanlink.net>
X-Sender: cabbey AT pop DOT bresnanlink DOT net
X-Mailer: QUALCOMM Windows Eudora Version 4.3.2
Date: Mon, 23 Oct 2000 23:32:01 -0500
From: Chris Abbey
Subject: RE: non latin file names?
In-Reply-To: <000c01c03cbd$ca259500$21c9ca95@mow.siemens.ru>
References: <4 DOT 3 DOT 2 DOT 7 DOT 0 DOT 20001020221955 DOT 00b019c0 AT pop DOT bresnanlink DOT net>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed

At 10:52 10/23/00 +0400, Andrej Borsenkow wrote:
>It is not a "cygwin" - it is `ls' command, that replaces non-printable
>characters with `?'.

very true... ls asks the libc to render an 8bit character, and libc
doesn't know how to convert this code point so it punts with the
universal substitution char... like I said, I understand that part...
it was why md5sum couldn't find a given file name out of the sums
file that was confusing me.

> Which file, sorry? File, that you have copied from NT to Linux (onto
>SAMBA-exported partition)?

both. ;) the sums file from linux contains a filename that md5sum
on nt can't locate... and vice versa. That's how I originally got into
this, was cross checking files from a very hacked together solution
for moving a HUGE amount of data... it involved dd, bzip2, plip, and
bsd pipes... very ugly and we wanted to sanity check things after the
fact. By now I've compared the md5sums by hand... ugh :p

> O.K., you have (probably) two distinct problems here:
>Problem 1 - SAMBA and 8-bit characters.
>
>You must tell SAMBA what OEM code page is used by your client. This is
>probably either 850 or 437. You better ask on SAMBA list about this problem.

most likely this is the problem, I've done some more hacking (trying
to answer your previous questions with a step by step demo you could
run yourself to do a bare minimum recreation) and I see that whatever
error is happening on the xfr from windows to linux via samba it's
orthogonal... gigo... such that even though the filename becomes
garbage on linux, it's at least consistently able to spit it back to
me correctly on NT. I'll follow up with a samba guru I know at work.

The root problem in this case was that, while the two files appeared
to have the same file name, with the same glyphs, at a binary level
they didn't match... cygwin on nt wanted to use 0xF3, but the file
that came over from linux had 0xA2.

>Problem 2 - locale support in Cygwin
>
>Cygwin does not have any locale support at all. There is stub implementation
>for setlocale that basically sets locale to C. Two possible implementations
>are:
>
>- use own locale database (basically, reimplement standard glibc locale
>support)
>- rely on Windows locale support if possible.
>
>I prefer the second.

I'd vote for the first, namely a port of gconv... the GNU impl of
iconv... clean and posix compliant as opposed to whatever m$ came up
with... not that I'm too familiar with m$'s "solution", just making a
prediction based on past experience. (iconv otoh I am familiar with...
it rocks.)

>Of course, either needs somebody to implement :-)

oh so much code to write... so little time... what joy it is to be a
geek. :>
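
For anyone who wants to poke at the 0xF3 vs 0xA2 mismatch described above, here is a minimal Python sketch. It rests on assumptions that the thread does not confirm: that the character in question was "ó", that the NT side used the Windows ANSI code page 1252, and that the other side used one of the OEM code pages mentioned above (437 or 850). The helper names `same_glyph` and `reencode` and the sample filename are hypothetical, made up for illustration.

```python
# Byte values come from the message; the code page choices are assumptions.
ANSI_CODEPAGE = "cp1252"            # assumed Windows ANSI code page on NT
OEM_CODEPAGES = ("cp437", "cp850")  # OEM code pages named in the thread


def same_glyph(byte_a: int, cp_a: str, byte_b: int, cp_b: str) -> bool:
    """Return True when the two single bytes decode to the same character."""
    return bytes([byte_a]).decode(cp_a) == bytes([byte_b]).decode(cp_b)


def reencode(name: bytes, src: str, dst: str) -> bytes:
    """Re-encode a raw filename from one code page to another."""
    return name.decode(src).encode(dst)


if __name__ == "__main__":
    # Same glyph, different byte, depending on the code page in use.
    for oem in OEM_CODEPAGES:
        print(oem, same_glyph(0xF3, ANSI_CODEPAGE, 0xA2, oem))

    # Hypothetical filename as it might be stored under an OEM code page.
    print(reencode(b"foto\xa2.txt", "cp437", ANSI_CODEPAGE))
```

If those assumptions hold, a byte-for-byte lookup of the name from the sums file would fail even though both names display identically, and converting the names in the sums file first (roughly the job iconv does) would let them match.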