Mail Archives: djgpp/2010/03/03/17:55:25

X-Authentication-Warning: delorie.com: mail set sender to djgpp-bounces using -f
X-Recipient: djgpp AT delorie DOT com
Date: Thu, 04 Mar 2010 00:55:21 +0200
From: Eli Zaretskii <eliz AT gnu DOT org>
Subject: Re: Bug in findfirst/findnext: mangles certain characters.
In-reply-to: <hmmipg$4lv$1@speranza.aioe.org>
To: djgpp AT delorie DOT com
Message-id: <83lje92exy.fsf@gnu.org>
MIME-version: 1.0
X-012-Sender: halo1 AT inter DOT net DOT il
References: <2PydnQe72P4H_BrWnZ2dnUVZ_vmdnZ2d AT giganews DOT com> <hmbvg7$ieq$1 AT speranza DOT aioe DOT org> <Br-dnQ3aTcaMsRfWnZ2dnUVZ_uSdnZ2d AT giganews DOT com> <hmmipg$4lv$1 AT speranza DOT aioe DOT org>
Reply-To: djgpp AT delorie DOT com
Errors-To: nobody AT delorie DOT com
X-Mailing-List: djgpp AT delorie DOT com
X-Unsubscribes-To: listserv AT delorie DOT com

> From: "Rod Pemberton" <do_not_have AT havenone DOT cmm>
> Date: Wed, 3 Mar 2010 16:02:51 -0500
> Bytes: 4067
>
> Apparently, "alt 0xxx" is an undocumented method to produce ISO8859-1 in
> Windows only.

I think it's codepage 1252, not ISO 8859-1.  But they are identical,
except for the range 128..159 decimal.
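
The difference between the two encodings can be checked mechanically. The following is a minimal sketch that assumes Python's built-in "cp1252" and "latin-1" codecs are faithful stand-ins for Windows codepage 1252 and ISO 8859-1; the helper name differing_bytes is made up for illustration.

```python
def differing_bytes():
    """Return the byte values that decode differently in cp1252 and ISO 8859-1."""
    diffs = []
    for b in range(256):
        raw = bytes([b])
        latin1 = raw.decode("latin-1")  # ISO 8859-1 assigns every byte value
        try:
            cp1252 = raw.decode("cp1252")
        except UnicodeDecodeError:
            # A few byte values are unassigned in Python's cp1252 table;
            # treat those as differing from ISO 8859-1 as well.
            cp1252 = None
        if cp1252 != latin1:
            diffs.append(b)
    return diffs


diffs = differing_bytes()
print(min(diffs), max(diffs), len(diffs))
```

Under those assumptions, every differing byte falls in the 128..159 range, so the bytes 0xA6 and 0xCC from the file name discussed here mean the same thing in both encodings.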

> ab¦Ì.txt    Name in Windows (a,b,0xA6,0xCC,.txt)
> ab__.txt    LFN in Windows98SE DOS console
> AB__~1.txt  SFN in Windows98SE DOS console

Underscores are how Windows 9x translates characters it cannot express
in the DOS OEM charset (codepage 437, in your case, I think).
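
The substitution can be imitated in a few lines. This is only an illustrative sketch, not the actual Windows 9x algorithm: it assumes Python's "cp437" codec matches the DOS OEM charset, and the function name to_oem is hypothetical.

```python
def to_oem(name, codepage="cp437"):
    """Replace each character that the OEM codepage cannot encode with '_'."""
    out = []
    for ch in name:
        try:
            ch.encode(codepage)  # raises if the character has no OEM equivalent
            out.append(ch)
        except UnicodeEncodeError:
            out.append("_")
    return "".join(out)


# U+00A6 (broken bar) and U+00CC (I with grave) are the two characters
# from the file name quoted above.
oem_name = to_oem("ab\u00a6\u00cc.txt")
print(oem_name)
```

Under these assumptions, neither character has a codepage 437 equivalent, so the sketch yields the same "ab__.txt" that the Windows98SE DOS console showed.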

> The shorter name in Win98 *after* a reboot:
> ab¦Ì.txt
>
> Still there in Windows!!!
>
> So, the ISO8859-1 character information *is* being preserved.  My guess is
> it's probably in the LFN, just inaccessible from or converted when in a DOS
> or Windows DOS Console.

Yes, that's true.

> However, I thought ODI's ldir.exe does raw directory access (doesn't
> use the LFN API but displays LFN's) and it displays *underscores*.

The translation happens when the characters are displayed, not when
the directory entry is accessed.

> I'll have to start up a disk editor to see if the character
> information is actually stored in the directory structure, and in
> what format.

On Windows 9x, I think you will find that directory entries include
the characters from the codepage which was specified when Windows was
installed.  Windows 9x supports only a single 8-bit encoding, usually
one of the 12XX codepages.
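
To see why the codepage matters when reading such bytes, here is a small sketch that decodes the two byte values from the file name above (0xA6, 0xCC) under both codepages mentioned in this message. It assumes, purely for illustration, that the name is held as plain 8-bit bytes, and it relies on Python's "cp1252" and "cp437" codecs.

```python
# The two non-ASCII bytes from the file name under discussion.
stored = bytes([0xA6, 0xCC])

# Interpreted in the Windows codepage, these are the characters typed in Windows.
as_windows = stored.decode("cp1252")

# Interpreted in the DOS OEM codepage, the same bytes are different characters.
as_dos = stored.decode("cp437")

print(repr(as_windows), repr(as_dos))
```

The same two bytes read as a broken bar and a capital I with grave in codepage 1252, but as a feminine ordinal indicator and a box-drawing character in codepage 437, which is why a tool has to translate rather than copy bytes between the two environments.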
