Date: Thu, 04 Mar 2010 00:55:21 +0200
From: Eli Zaretskii
Subject: Re: Bug in findfirst/findnext: mangles certain characters.
To: djgpp AT delorie DOT com
Message-id: <83lje92exy.fsf@gnu.org>
References: <2PydnQe72P4H_BrWnZ2dnUVZ_vmdnZ2d AT giganews DOT com>

> From: "Rod Pemberton"
> Date: Wed, 3 Mar 2010 16:02:51 -0500
> Bytes: 4067
>
> Apparently, "alt 0xxx" is an undocumented method to produce ISO8859-1 in
> Windows only.

I think it's codepage 1252, not ISO 8859-1.  But they are identical,
except for the range 128..159 decimal.

> ab¦Ì.txt      Name in Windows (a,b,0xA6,0xCC,.txt)
> ab__.txt      LFN in Windows98SE DOS console
> AB__~1.txt    SFN in Windows98SE DOS console

Underscores are how Windows 9x translates characters it cannot express
in the DOS OEM charset (codepage 437, in your case, I think).

> The shorter name in Win98 *after* a reboot:
> ab¦Ì.txt
>
> Still there in Windows!!!
>
> So, the ISO8859-1 character information *is* being preserved.  My guess is
> it's probably in the LFN, just inaccessible from or converted when in a DOS
> or Windows DOS Console.

Yes, that's true.

> However, I thought ODI's ldir.exe does raw directory access (doesn't
> use the LFN API but displays LFN's) and it displays *underscores*.

The translation happens when the characters are displayed, not when
the directory entry is accessed.

> I'll have to startup a disk editor to see if the character
> information is actually stored in the directory structure, and in
> what format.

On Windows 9x, I think you will find that directory entries include the characters from the codepage which was specified when Windows was installed. Windows 9x supports only a single 8-bit encoding, usually one of the 12XX codepages.
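
For anyone who wants to check the codepage points in this message without a Windows 9x box, here is a minimal Python sketch. It assumes Python's built-in cp1252, latin-1 and cp437 codec tables are close enough to the Windows ones for this purpose. The helper names (differing_bytes, to_oem_display) are made up for illustration, and the underscore substitution only imitates the behaviour observed in the DOS console; it is not the routine Windows actually uses.

```python
# Illustrative sketch only: Python's codec tables stand in for the Windows
# codepages, and both helpers are hypothetical names, not Windows APIs.


def differing_bytes(a="cp1252", b="latin-1"):
    """Return the byte values that codecs `a` and `b` decode differently.

    A byte that is unassigned in one codec but assigned in the other
    counts as a difference.
    """
    diff = []
    for value in range(256):
        raw = bytes([value])
        try:
            left = raw.decode(a)
        except UnicodeDecodeError:
            left = None  # byte is unassigned in codec `a`
        try:
            right = raw.decode(b)
        except UnicodeDecodeError:
            right = None  # byte is unassigned in codec `b`
        if left != right:
            diff.append(value)
    return diff


def to_oem_display(name, oem="cp437"):
    """Replace every character the OEM codepage cannot express with '_'.

    This mimics what the DOS console showed for the test file; the real
    Windows 9x translation may differ in details.
    """
    out = []
    for ch in name:
        try:
            ch.encode(oem)
            out.append(ch)
        except UnicodeEncodeError:
            out.append("_")
    return "".join(out)


if __name__ == "__main__":
    diff = differing_bytes()
    print(diff[0], diff[-1], len(diff))    # 128 159 32

    name = "ab\u00a6\u00cc.txt"            # a, b, 0xA6, 0xCC, .txt
    print(name.encode("cp1252"))           # b'ab\xa6\xcc.txt'
    print(to_oem_display(name))            # ab__.txt
```

The first check shows that the two encodings part ways only in 128..159. The second shows that the name round-trips as the bytes 0xA6 and 0xCC in codepage 1252, while neither character exists in codepage 437, hence the two underscores in the console listing.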