Mail Archives: djgpp/2010/03/03/16:15:04

X-Authentication-Warning: delorie.com: mail set sender to djgpp-bounces using -f
From: "Rod Pemberton" <do_not_have AT havenone DOT cmm>
Newsgroups: comp.os.msdos.djgpp
Subject: Re: Bug in findfirst/findnext: mangles certain characters.
Date: Wed, 3 Mar 2010 16:02:51 -0500
Organization: Aioe.org NNTP Server
Lines: 99
Message-ID: <hmmipg$4lv$1@speranza.aioe.org>
References: <2PydnQe72P4H_BrWnZ2dnUVZ_vmdnZ2d AT giganews DOT com> <hmbvg7$ieq$1 AT speranza DOT aioe DOT org> <Br-dnQ3aTcaMsRfWnZ2dnUVZ_uSdnZ2d AT giganews DOT com>
NNTP-Posting-Host: pldq+kT97bAAp/ObDwnZyQ.user.speranza.aioe.org
X-Complaints-To: abuse AT aioe DOT org
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1983
X-Notice: Filtered by postfilter v. 0.8.2
X-Newsreader: Microsoft Outlook Express 6.00.2800.1983
X-Priority: 3
X-MSMail-Priority: Normal
Bytes: 4067
To: djgpp AT delorie DOT com
DJ-Gateway: from newsgroup comp.os.msdos.djgpp
Reply-To: djgpp AT delorie DOT com

"Robbie Hatley" <see DOT my DOT signature AT for DOT my DOT contact DOT info> wrote in message
news:Br-dnQ3aTcaMsRfWnZ2dnUVZ_uSdnZ2d AT giganews DOT com...
>
> Also, you can type any iso-8859-1 characters you want like so:
> 1. Hold down the "Alt" key.
> 2. Type 0 (the zero key) followed by the 3-digit decimal
>    iso-8859-1 code for the glyph you want.
> 3. Release "Alt" key
> I do that to create pathological test files for my file-utility
> programs.  Like so:
>
> "Fíle-Nàme-Wíth-Måny-ïsõ-8859-1-Lëttêrs-Ìn-Ít.txt"
>

First, I think you should read Eli Zaretskii's reply.

Second, let's make sure I'm on the same page as you, then I'll tell you what
I've tried, and I'll try other stuff if you want.  I'm only familiar with
ASCII filenames.

Apparently, "alt 0xxx" is an undocumented method to produce ISO8859-1
characters, and it works in Windows only.  It requires the keypad digits with
Num-Lock enabled.  "alt xxx", on the other hand, produces DOS characters in
Windows, real DOS, and the Windows DOS Console.  Real DOS and the Windows DOS
Console treat "alt 0xxx" the same as "alt xxx".

I'm using Windows98SE/MS-DOS v7.10 with KernelEx, not Win2kPro.

Windows98SE DOS console
alt 204 = 0x35 = 53
alt 0204 = 0x35 = 53

Windows98SE filename box
alt 204 = 0xA6 = 166
alt 0204 = 0xCC = 204

An SFN with 0xA6 and 0xCC:
ab¦Ì.txt    Name in Windows (a,b,0xA6,0xCC,.txt)
ab__.txt    LFN in Windows98SE DOS console
AB__~1.txt  SFN in Windows98SE DOS console

An LFN with 0xA6 and 0xCC:
abbbbbbbbbbbbbb¦Ì.txt    Win Name
abbbbbbbbbbbbbb__.txt    LFN
ABBBBB~1.txt             SFN

Since the LFN in the DOS console appeared to translate the two characters to
underscores, I rebooted to see if the alt-xxx/alt-0xxx character information
in Windows is preserved.

The shorter name in real DOS (MS-DOS v7.10):
ab__.txt    LFN in real DOS
AB__~1.txt  SFN in real DOS

Still underscores.
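
One possible explanation for the underscores, not confirmed in this thread:
neither 0xA6 (broken bar) nor 0xCC (capital I with grave) has an equivalent in
code page 437, so a conversion of the stored name to the OEM code page has
nothing to map them to.  The C sketch below only simulates that substitution
for Latin-1 input.  The table and the function names are illustrative
assumptions, not the actual DOS or Windows code.

```c
#include <assert.h>
#include <stddef.h>

/* Latin-1 code points in 0xA0-0xFF that also exist somewhere in the
   upper half of code page 437.  Assumption: built from the standard
   CP437 table; the two glyphs CP437 keeps in the control-code range
   (section and pilcrow signs) are deliberately left out. */
static const unsigned char latin1_in_cp437[] = {
    0xA0, 0xA1, 0xA2, 0xA3, 0xA5, 0xAA, 0xAB, 0xAC, 0xB0, 0xB1, 0xB2,
    0xB5, 0xB7, 0xBA, 0xBB, 0xBC, 0xBD, 0xBF, 0xC4, 0xC5, 0xC6, 0xC7,
    0xC9, 0xD1, 0xD6, 0xDC, 0xDF, 0xE0, 0xE1, 0xE2, 0xE4, 0xE5, 0xE6,
    0xE7, 0xE8, 0xE9, 0xEA, 0xEB, 0xEC, 0xED, 0xEE, 0xEF, 0xF1, 0xF2,
    0xF3, 0xF4, 0xF6, 0xF7, 0xF9, 0xFA, 0xFB, 0xFC, 0xFF
};

/* Return 1 if the Latin-1 character c can be expressed in CP437. */
static int latin1_has_cp437_equivalent(unsigned char c)
{
    size_t i;

    if (c < 0x80)
        return 1;               /* ASCII is common to both sets */
    for (i = 0; i < sizeof latin1_in_cp437; i++)
        if (latin1_in_cp437[i] == c)
            return 1;
    return 0;                   /* e.g. 0x80-0x9F, 0xA6, 0xCC */
}

/* Copy src to dst, writing '_' for every character that CP437 cannot
   represent.  Representable characters are copied unchanged (they are
   NOT translated to their CP437 byte values) to keep the sketch short. */
static void simulate_oem_name(const char *src, char *dst, size_t dst_size)
{
    size_t i = 0;

    assert(src != NULL && dst != NULL && dst_size > 0);
    for (; src[i] != '\0' && i + 1 < dst_size; i++) {
        unsigned char c = (unsigned char)src[i];
        dst[i] = latin1_has_cp437_equivalent(c) ? src[i] : '_';
    }
    dst[i] = '\0';
}
```

Feeding it the bytes a, b, 0xA6, 0xCC followed by ".txt" gives a name with
two underscores in the same positions as the listings above.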

The shorter name in Win98 *after* a reboot:
ab¦Ì.txt

Still there in Windows!!!

So, the ISO8859-1 character information *is* being preserved.  My guess is
that it's in the LFN, and is either inaccessible from, or converted in, real
DOS and the Windows DOS Console.  However, I thought ODI's ldir.exe does raw
directory access (it doesn't use the LFN API but displays LFNs), and it
displays *underscores*.  So, that's interesting...  I'm not sure what is
going on exactly or where the info is stored.  I'll have to start up a disk
editor to see if the character information is actually stored in the
directory structure, and in what format.  That could take me a while to
find.
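
For the disk editor step, it may help to know the layout to look for.  In
VFAT, a long filename is kept in extra 32-byte directory slots marked with
attribute byte 0x0F, each holding up to 13 UTF-16 code units, so 0xA6 and 0xCC
would be expected to show up as the byte pairs A6 00 and CC 00.  Below is a
minimal sketch for decoding one such slot copied out of a disk editor.  The
function and macro names are hypothetical, and checksum and sequence-number
handling are left out.

```c
#include <assert.h>
#include <stddef.h>

#define LFN_SLOT_SIZE  32
#define LFN_CHARS      13
#define LFN_ATTR       0x0F   /* attribute value that marks an LFN slot */

/* Byte offsets of the 13 little-endian UTF-16 code units in one slot. */
static const unsigned char lfn_char_offset[LFN_CHARS] = {
    1, 3, 5, 7, 9,              /* characters 1-5   */
    14, 16, 18, 20, 22, 24,     /* characters 6-11  */
    28, 30                      /* characters 12-13 */
};

/* Decode the name fragment held in one 32-byte LFN slot.
   Returns the number of code units found before the 0x0000 terminator
   (13 if the slot is full), or -1 if the slot is not an LFN slot. */
static int lfn_slot_chars(const unsigned char *slot,
                          unsigned short out[LFN_CHARS])
{
    int i;

    assert(slot != NULL && out != NULL);
    if (slot[11] != LFN_ATTR)
        return -1;
    for (i = 0; i < LFN_CHARS; i++) {
        unsigned off = lfn_char_offset[i];
        unsigned short u =
            (unsigned short)(slot[off] | (slot[off + 1] << 8));
        if (u == 0x0000)
            break;              /* end of name; the rest is padding */
        out[i] = u;
    }
    return i;
}
```

If the slot for the test file decodes to code units 0x00A6 and 0x00CC in
positions three and four, the characters are stored on disk and the
underscores come from a later conversion step.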

To find out whether I could get the 0xA6 and 0xCC characters in a DOS
environment, I tested these DOS LFN-capable environments:
  1) Windows98SE DOS Console
  2) real DOS (MS-DOS v7.10) with DOSLFN 0.32o

I tried these methods:
  1) DJGPP findfirst/findnext loop
  2) DJGPP _dos_findfirst/_dos_findnext loop
  3) MS-DOS dir
  4) DJGPP's ls -l
  5) ODI's ldir

No luck.  I did not change the code page, which is 437.

Both first/next loops were tested with DJGPP v2.03 and v2.04.  They return
the same LFN/SFN info as above: no 0xA6 or 0xCC characters are present, and
both are replaced with underscores.
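
A loop of the kind tested above can be sketched as follows.  It prints each
name returned by findfirst/findnext together with its bytes in hex, so that a
literal underscore (5F) can be told apart from 0xA6 or 0xCC.  The helper names
hex_bytes and list_names are illustrative assumptions, and the DJGPP-specific
part is only compiled under DJGPP.

```c
#include <assert.h>
#include <stdio.h>
#ifdef __DJGPP__
#include <dir.h>                /* findfirst, findnext, struct ffblk */
#endif

/* Write the bytes of name into out as two hex digits plus a space each,
   stopping early if out would overflow. */
static void hex_bytes(const char *name, char *out, size_t out_size)
{
    size_t used = 0;

    assert(name != NULL && out != NULL && out_size > 0);
    out[0] = '\0';
    /* each byte needs 3 characters plus the terminating NUL */
    for (; *name != '\0' && used + 4 <= out_size; name++)
        used += (size_t)sprintf(out + used, "%02X ",
                                (unsigned char)*name);
}

#ifdef __DJGPP__
/* Print every directory entry matching pattern, with a hex dump of the
   name exactly as findfirst/findnext returned it. */
static void list_names(const char *pattern)
{
    struct ffblk ff;
    char hex[3 * sizeof ff.ff_name + 1];
    int done = findfirst(pattern, &ff,
                         FA_RDONLY | FA_HIDDEN | FA_SYSTEM |
                         FA_DIREC | FA_ARCH);

    while (!done) {             /* both calls return 0 on success */
        hex_bytes(ff.ff_name, hex, sizeof hex);
        printf("%s\n  %s\n", ff.ff_name, hex);
        done = findnext(&ff);
    }
}
#endif
```

Under DJGPP, calling list_names("*.*") from main in the test directory would
show whether the returned name really contains 5F 5F where 0xA6 and 0xCC
were typed.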


Rod Pemberton





