From: "Rod Pemberton"
Newsgroups: comp.os.msdos.djgpp
Subject: Re: Bug in findfirst/findnext: mangles certain characters.
Date: Wed, 3 Mar 2010 16:02:51 -0500
Organization: Aioe.org NNTP Server
References: <2PydnQe72P4H_BrWnZ2dnUVZ_vmdnZ2d AT giganews DOT com>
To: djgpp AT delorie DOT com
Reply-To: djgpp AT delorie DOT com

"Robbie Hatley" wrote in message
news:Br-dnQ3aTcaMsRfWnZ2dnUVZ_uSdnZ2d AT giganews DOT com...
>
> Also, you can type any iso-8859-1 characters you want like so:
> 1. Hold down the "Alt" key.
> 2. Type 0 (the zero key) followed by the 3-digit decimal
>    iso-8859-1 code for the glyph you want.
> 3. Release "Alt" key
> I do that to create pathological test files for my file-utility
> programs. Like so:
>
> "Fíle-Nàme-Wíth-Måny-ïsõ-8859-1-Lëttêrs-Ìn-Ít.txt"
>

First, I think you should read Eli Zaretskii's reply.

Second, let's make sure I'm on the same page as you, then I'll tell you
what I've tried, and I'll try other stuff if you want.

I'm only familiar with ASCII filenames. Apparently, "alt 0xxx" is an
undocumented method to produce ISO8859-1 in Windows only. It only works
with the keypad numbers and Num-Lock must be enabled. But, "alt xxx"
produces DOS characters in Windows, real DOS, and Windows DOS Console.
Real DOS and Windows DOS Console treat "alt 0xxx" the same as "alt xxx".

I'm using Windows98SE/MS-DOS v7.10 with KernelEx, not Win2kPro.

Windows98SE DOS console
  alt 204  = 0x35 = 53
  alt 0204 = 0x35 = 53

Windows98SE filename box
  alt 204  = 0xA6 = 166
  alt 0204 = 0xCC = 204

A SFN with 0xA6 and 0xCC:

  ab¦Ì.txt      Name in Windows (a,b,0xA6,0xCC,.txt)
  ab__.txt      LFN in Windows98SE DOS console
  AB__~1.txt    SFN in Windows98SE DOS console

An LFN with 0xA6 and 0xCC:

  abbbbbbbbbbbbbb¦Ì.txt    Win Name
  abbbbbbbbbbbbbb__.txt    LFN
  ABBBBB~1.txt             SFN

Since the LFN in the DOS console appeared to translate the two
characters to underscores, I rebooted to see if the alt-xxx/alt-0xxx
character information in Windows is preserved.

The shorter name in real DOS (MS-DOS v7.10):

  ab__.txt      LFN in real DOS
  AB__~1.txt    SFN in real DOS

Still underscores.

The shorter name in Win98 *after* a reboot:

  ab¦Ì.txt

Still there in Windows!!!

So, the ISO8859-1 character information *is* being preserved. My guess
is it's probably in the LFN, just inaccessible from, or converted when
in, real DOS or a Windows DOS Console. However, I thought ODI's ldir.exe
does raw directory access (doesn't use the LFN API but displays LFNs)
and it displays *underscores*. So, that's interesting... I'm not sure
what is going on exactly or where the info is stored. I'll have to start
up a disk editor to see if the character information is actually stored
in the directory structure, and in what format. That could take me a
while to find.

In order to find out if I could get the 0xA6 and 0xCC characters in a
DOS environment, I tested these LFN-capable DOS environments:

1) Windows98SE DOS Console
2) real DOS (MS-DOS v7.10) with DOSLFN 0.32o

I tried these methods:

1) DJGPP findfirst/findnext loop
2) DJGPP _dos_findfirst/_dos_findnext loop
3) MS-DOS dir
4) DJGPP's ls -l
5) ODI's ldir

No luck. I did not change the Code Page, which is 437. Both first/next
loops were tested with DJGPP v2.03 and v2.04. They return the same
LFN/SFN info as above (no 0xA6 or 0xCC characters present; they are
replaced with underscores).


Rod Pemberton
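
For anyone who wants to repeat the check above, here is a minimal sketch
in plain C of how a returned name can be inspected byte by byte, rather
than trusting what the console displays. The helper names
(count_high_bytes, name_has_byte, hex_dump_name) are made up for this
illustration and are not part of DJGPP or any other library.

```c
#include <stddef.h>

/* Count the bytes in a NUL-terminated name that are outside 7-bit
   ASCII (value >= 0x80). Hypothetical helper, not a library call. */
size_t count_high_bytes(const char *name)
{
    const unsigned char *p;
    size_t n = 0;

    for (p = (const unsigned char *)name; *p; ++p)
        if (*p >= 0x80)
            ++n;
    return n;
}

/* Return 1 if the name contains the byte `wanted`, otherwise 0. */
int name_has_byte(const char *name, unsigned char wanted)
{
    const unsigned char *p;

    for (p = (const unsigned char *)name; *p; ++p)
        if (*p == wanted)
            return 1;
    return 0;
}

/* Write the name as space-separated uppercase hex bytes into `out`,
   e.g. "61 62 A6 CC". Returns the number of characters written (not
   counting the terminating NUL), or -1 if `out` is too small. */
int hex_dump_name(const char *name, char *out, size_t outsize)
{
    static const char digits[] = "0123456789ABCDEF";
    const unsigned char *p;
    size_t pos = 0;

    if (outsize == 0)
        return -1;
    for (p = (const unsigned char *)name; *p; ++p) {
        size_t need = pos ? 3 : 2;      /* optional space + two digits */
        if (pos + need + 1 > outsize)   /* +1 for the terminating NUL */
            return -1;
        if (pos)
            out[pos++] = ' ';
        out[pos++] = digits[*p >> 4];
        out[pos++] = digits[*p & 0x0F];
    }
    out[pos] = '\0';
    return (int)pos;
}
```

In a DJGPP findfirst/findnext loop these would be called on the ff_name
member of struct ffblk (or on the name member of struct find_t for the
_dos_ variants). A name that dumps with 5F 5F where A6 CC was expected
means the underscores were already in the string the call returned, not
added by the console when printing.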