Mail Archives: djgpp/2010/02/27/16:00:23

X-Authentication-Warning: delorie.com: mail set sender to djgpp-bounces using -f
From: Rugxulo <rugxulo AT gmail DOT com>
Newsgroups: comp.os.msdos.djgpp
Subject: Re: Bug in findfirst/findnext: mangles certain characters.
Date: Fri, 26 Feb 2010 17:27:47 -0800 (PST)
Organization: http://groups.google.com
Lines: 92
Message-ID: <5099c66a-fad4-42b6-8fb0-aaae2f01d35e@19g2000yqu.googlegroups.com>
References: <2PydnQe72P4H_BrWnZ2dnUVZ_vmdnZ2d AT giganews DOT com>
NNTP-Posting-Host: 65.13.115.246
Mime-Version: 1.0
X-Trace: posting.google.com 1267234068 16428 127.0.0.1 (27 Feb 2010 01:27:48 GMT)
X-Complaints-To: groups-abuse AT google DOT com
NNTP-Posting-Date: Sat, 27 Feb 2010 01:27:48 +0000 (UTC)
Complaints-To: groups-abuse AT google DOT com
Injection-Info: 19g2000yqu.googlegroups.com; posting-host=65.13.115.246;
posting-account=p5rsXQoAAAB8KPnVlgg9E_vlm2dvVhfO
User-Agent: G2/1.0
X-HTTP-UserAgent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.1.7)
Gecko/20091221 Firefox/3.5.7 (.NET CLR 3.5.30729),gzip(gfe),gzip(gfe)
Bytes: 4989
X-Original-Bytes: 4946
To: djgpp AT delorie DOT com
DJ-Gateway: from newsgroup comp.os.msdos.djgpp
Reply-To: djgpp AT delorie DOT com

Hi,

On Feb 25, 11:54 pm, "Robbie Hatley"
<see DOT my DOT signat DOT  DOT  DOT  AT for DOT my DOT contact DOT info> wrote:
>
> I've noticed that whenever I write programs using djgpp which
> rename files, if they encounter files with certain characters
> in their names, rename attempts fail, because findfirst()
> and findnext() change the numeric value of those characters,
> apparently in an attempt to re-map them to some other
> character encoding.
>
> Some extended-ASCII characters DO get through unmolested.
> But most non-ASCII characters get re-mapped.
> (snip)
> Those are all legal characters in both iso-8859-1 and
> in Windows long file names.

DJGPP only properly supports the "C" locale, i.e. 7-bit ASCII. Anything
beyond that isn't available. For pure DOS, you can try the third-party
llocl02b.zip library, but even that may not work (I haven't tested it
much myself), and it needs COUNTRY.SYS + DISPLAY + EGA?.CPI + KEYB or
similar. (Henrique Peron of FreeDOS is the resident expert in this
area, FYI, if you really need help.)

http://djgpp.cybermirror.org/current/v2tk/llocl02b.zip
http://djgpp.cybermirror.org/current/v2tk/llocl02s.zip

BTW, which Windows are you using? I'll guess XP. Anyway, I guess you
know XP (even with FAT partitions?) uses UTF-16. So there is no
Latin-1 there (nor was there any in Win9x; cp850 is just an
altered variant with most of the same glyphs).

http://www.kostis.net/en/index.htm
http://www.kostis.net/freeware/isocp101.zip

isocp101.zip  	V1.01
1993-12-19 	ISO 8859-x code pages for MS-DOS
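
To make the cp850 vs. Latin-1 point above concrete, here is a minimal
sketch of how a few accented letters sit at different byte values in the
two encodings. The table is only a small hand-picked subset, and the names
cp_pair, sample_pairs, latin1_to_cp850 and print_sample_pairs are made up
for illustration. Whether this particular translation is what actually
happens on your system is just my guess, not something I've verified.

```c
#include <stdio.h>

/* A few accented letters with their byte values in ISO 8859-1 and in
   DOS code page 850. Illustrative subset only, not a complete table. */
struct cp_pair {
    const char *name;       /* plain-ASCII description of the letter */
    unsigned char latin1;   /* byte value in ISO 8859-1 */
    unsigned char cp850;    /* byte value in code page 850 */
};

const struct cp_pair sample_pairs[] = {
    { "a with diaeresis", 0xE4, 0x84 },
    { "e with acute",     0xE9, 0x82 },
    { "n with tilde",     0xF1, 0xA4 },
    { "o with diaeresis", 0xF6, 0x94 },
    { "u with diaeresis", 0xFC, 0x81 }
};

/* Hypothetical helper: map a Latin-1 byte to its cp850 equivalent.
   ASCII passes through unchanged; bytes missing from the sample
   table come back as '?'. */
unsigned char latin1_to_cp850(unsigned char c)
{
    size_t i;

    if (c < 0x80)
        return c;
    for (i = 0; i < sizeof sample_pairs / sizeof sample_pairs[0]; ++i) {
        if (sample_pairs[i].latin1 == c)
            return sample_pairs[i].cp850;
    }
    return '?';
}

/* Print the sample table, one letter per line. */
void print_sample_pairs(FILE *out)
{
    size_t i;

    for (i = 0; i < sizeof sample_pairs / sizeof sample_pairs[0]; ++i) {
        fprintf(out, "%-18s Latin-1 0x%02X  cp850 0x%02X\n",
                sample_pairs[i].name,
                (unsigned)sample_pairs[i].latin1,
                (unsigned)sample_pairs[i].cp850);
    }
}
```

Call print_sample_pairs(stdout) from your own main() to see the table. If
the bytes you get back from findfirst match the cp850 column while the
names on disk were created with the Latin-1 values, that would at least
point at a code page translation rather than random corruption.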

> BUT, for some reason,
> findfirst() and findnext() convert them to other characters.
> It looks to me like these functions are trying to convert
> characters they don't like into characters with similar-looking
> glyphs in some other encoding.
>
> This is broken, because it causes rename attempts to fail,
> because no files actually exist with the altered versions
> of their names given by findfirst/findnext.

So is this a problem with findfirst / findnext, with rename, or with both?
Does a simple findfirst / findnext app (e.g. ls.exe) report the names
correctly? (Using iconv???)
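
If you want to check this yourself, something like the following rough
sketch should do: it lists the matching files and prints every byte of
each name in hex, so you can compare against what you expect on disk. The
helper hex_bytes is my own invention; findfirst, findnext and struct ffblk
come from DJGPP's <dir.h>, so the main() part is guarded and only builds
with DJGPP. I haven't run it against your problem files.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write the bytes of `name` into `out` as
   two-digit hex values separated by spaces. Stops early if `out`
   is too small. Returns the number of characters written. */
size_t hex_bytes(const char *name, char *out, size_t outsize)
{
    const unsigned char *p = (const unsigned char *)name;
    size_t used = 0;

    if (outsize == 0)
        return 0;
    out[0] = '\0';
    /* Each step needs at most 4 bytes: space, two digits, NUL. */
    for (; *p != '\0' && used + 4 <= outsize; ++p) {
        if (used > 0)
            out[used++] = ' ';
        used += (size_t)sprintf(out + used, "%02X", (unsigned)*p);
    }
    return used;
}

#ifdef __DJGPP__
#include <dir.h>    /* findfirst, findnext, struct ffblk */

int main(int argc, char **argv)
{
    const char *pattern = (argc > 1) ? argv[1] : "*.*";
    struct ffblk ff;
    char hex[4 * sizeof ff.ff_name];
    int done;

    /* findfirst and findnext return 0 while there are more matches. */
    for (done = findfirst(pattern, &ff, 0); !done; done = findnext(&ff)) {
        hex_bytes(ff.ff_name, hex, sizeof hex);
        printf("%-32s %s\n", ff.ff_name, hex);
    }
    return 0;
}
#endif
```

Run it in the directory with the troublesome files and compare the hex
values with the ones you used when creating the names. That should tell
you whether the bytes are already changed when findfirst hands them over,
or whether rename is the one choking.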

> I'm curious if anyone has run across this bug before?

Probably not English-only Americans like me. I've (very very) briefly
dabbled in codepages "for fun" (Latin-3 ftw!), but nothing
hardcore.    ;-)

> And has this been fixed in recent versions?  (I'm using djgpp's
> gcc version "4.2.3", so I'm about 3 versions behind the latest.)

http://gcc.gnu.org/releases.html

GCC 4.2.3 	February 1, 2008

That's not really old, IMHO. Besides, it's not GCC proper's fault, per
se, it's our libc (i.e. DJGPP's) or the OS or both or ....

> If it's not been fixed, I suggest it should be put on the list of
> "bugs to fix in next release".

Cygwin (1.7) only recently gained full use of Unicode by dropping
Win9x support (bleh). And DJGPP is not Cygwin. The problem may indeed
lie with Windows (NTVDM limitation?). As mentioned, pure DOS is a
whole other ball of wax.

> In the mean time, anyone know of any workarounds for this?
> Some way to turn off the "character re-mapping" which
> findfirst and findnext are doing, and force them to retain
> the original numeric value of each character?

Does a simple "ren blah blah2" at the shell work? Bash? 4DOS? WinXP
CMD or command.com? FreeCOM? You'll have to test some things to see
what to expect, what works, etc.

P.S. The best (only??) DJGPP program to really support i18n features
is the text editor Mined (just released 2000.16). It probably has some
stuff in there that you would find useful. Give it a whirl in addition
to trying some of the above-mentioned stuff for completeness.

http://www.towo.net/mined/
