Mail Archives: djgpp-workers/2001/10/13/13:49:49

Date: Sat, 13 Oct 2001 19:41:52 +0200
From: "Eli Zaretskii" <eliz AT is DOT elta DOT co DOT il>
Sender: halo1 AT zahav DOT net DOT il
To: sandmann AT clio DOT rice DOT edu
Message-Id: <2957-Sat13Oct2001194151+0200-eliz@is.elta.co.il>
X-Mailer: Emacs 20.6 (via feedmail 8.3.emacs20_6 I) and Blat ver 1.8.9
CC: djgpp-workers AT delorie DOT com
In-reply-to: <10110131551.AA12493@clio.rice.edu> (sandmann@clio.rice.edu)
Subject: Re: W2K/XP fncase [was Re: New perl package]
References: <10110131551 DOT AA12493 AT clio DOT rice DOT edu>
Reply-To: djgpp-workers AT delorie DOT com
Errors-To: nobody AT delorie DOT com
X-Mailing-List: djgpp-workers AT delorie DOT com
X-Unsubscribes-To: listserv AT delorie DOT com

> From: sandmann AT clio DOT rice DOT edu (Charles Sandmann)
> Date: Sat, 13 Oct 2001 10:51:08 -0500 (CDT)
> > 
> > No, _lfn_gen_short_fname is the direct interface to the Windows
> > interrupt.  It does not exist merely to downcase file names when we
> > think we should.
> 
> That's not what it does - there is no interrupt for this in DOS, and
> _lfn_gen_short_name has code there that converts the string to upper 
> case and truncates it to 12 characters.  If this is really a wrapper
> for the Windows interrupt it should either fail on regular DOS or
> return the string unchanged.

On DOS, it simply does the equivalent of what Windows would have done
in that case.  We do similar things in other functions, e.g.,
_get_volume_info.
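
A minimal sketch of that kind of DOS-side emulation, going only by the
description quoted above (upper-case the string and truncate it to 12
characters).  The name dos_style_short_name is hypothetical, and this
is not the library's actual source:

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper, not the DJGPP source: copy at most 12
   characters of SRC into DST, upper-casing each one.  DST must
   have room for 13 bytes (12 characters plus the terminator).  */
char *
dos_style_short_name (const char *src, char *dst)
{
  size_t i;

  assert (src != NULL && dst != NULL);
  for (i = 0; i < 12 && src[i] != '\0'; i++)
    dst[i] = (char) toupper ((unsigned char) src[i]);
  dst[i] = '\0';
  return dst;
}
```

Comparing the result with the original name then tells the caller
whether the name was already a plain upper-case short name.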

> It also appears that in each of the 7 places this appears in the 
> library it is part of a strcmp with the long name - many of which are 
> not directly fncase related.  Even more interesting is that in none of
> those 7 places is the short name returned used at all except in
> the string comparison.

All true, but the function is also meant to be used by applications.
Do we really want to go out and check that none does?  Why waste our
time?  It's well known that once you provide an external function,
there's no way back--the genie is out of the bottle for good.

We _could_ replace _lfn_gen_short_fname's body with equivalent code,
but the code proposed here is not equivalent.

> So this function is not actually used
> anywhere in the library and each of these 7 places could be replaced
> by an even simpler copy of what I provided - which just returns a
> true or false flag if any characters would be changed.

Whether or not we replace the code which calls _lfn_gen_short_fname
in the library is a separate matter.  What I was arguing in this part
was that the new code cannot be called _lfn_gen_short_fname, because
it isn't equivalent to what _lfn_gen_short_fname does now.
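
For illustration only, a flag-returning check of the kind described in
the quoted text might look roughly like the sketch below.  The body
that was actually posted is not shown in this message, so the name
short_name_would_differ and all of its rules are assumptions: it looks
only at letter case and at the 8.3 length limits, and ignores
characters that are illegal in short names as well as the "." and ".."
entries.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical predicate: return nonzero if NAME would be changed
   by upper-casing it and forcing it into the 8.3 form.  */
int
short_name_would_differ (const char *name)
{
  const char *dot, *p;
  size_t base_len, ext_len;

  assert (name != NULL);
  dot = strrchr (name, '.');
  base_len = dot ? (size_t) (dot - name) : strlen (name);
  ext_len = dot ? strlen (dot + 1) : 0;

  /* Base name must be 1..8 characters, extension at most 3.  */
  if (base_len < 1 || base_len > 8 || ext_len > 3)
    return 1;
  /* A second dot inside the base name does not fit 8.3.  */
  if (memchr (name, '.', base_len) != NULL)
    return 1;
  /* Any lower-case letter would be changed by upper-casing.  */
  for (p = name; *p != '\0'; p++)
    if (islower ((unsigned char) *p))
      return 1;
  return 0;
}
```

Note that such a predicate produces no short name at all, which is why
it could not simply take over the existing function's name.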

> For example, if lfn=n we should always lower case
> the names (a very simple test) instead of needing to generate a 
> string we strcmp with, throw away and then duplicate this behavior.

That would make it impossible to see file names on DOS in their
original UPPER case; for example, try "djecho [A-Z]*" on plain DOS.
IIRC, some package (Groff?) depends on that for its build procedure.

> > I'm still puzzled why a global non-trivial change is deemed better
> > than something localized to a specific OS in an otherwise proven
> > function.  The current support for LFN-related features took several
> > releases to get right; do we really want to put that at jeopardy for
> > the sake of saving a few cycles?  
> 
> No, but since it is unreliable, is used in 7 different places, we need
> some way to fix this.  I'd like something consistent between the 
> operating systems for something as simple (and relatively unimportant)
> as what case short file names are returned in.

I agree with the goal; the argument is about the way to achieve that
goal.

This issue is full of hidden gotchas and unintended consequences,
because Microsoft's implementation of case-preservation is
semi-broken, haphazard, and sometimes downright nonsensical.  I have
scars from fine-tuning these issues all over my heart, and I'm too old
to see it (my heart) broken again.  We don't even have a test suite
that is extensive enough to test the effect of such changes, so most
probably we won't know until it's too late.

All I want is that we don't break what took so long to get right.

So maybe the code I wrote is wasteful.  I understand that it might
bug you to see a function which issues an RM interrupt, and whose
output is used inefficiently, or even not used at all.  But it works;
it was proven by two years of intensive use; and it certainly isn't a
bottleneck in any real-life application.

Therefore, my suggestion is: let's make a local change in
_lfn_gen_short_fname so that it calls Int 21h function 71A8h with
DH=1 on W2K and XP.  (We should check that this doesn't break NT with
the LFN TSR.)  The file names that come out bogus as a result are
very rare, and when they do happen, all that we'll see is that the
file name is not downcased when it should have been--not a big deal
IMHO.
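
A rough sketch of what such a call could look like through DJGPP's
DPMI interface.  It is untested on W2K/XP and is not the library
source.  The name gen_short_83 is hypothetical, and the DL value (OEM
character set for both names) is an assumption based on the usual
interrupt-list description of function 71A8h.  The DJGPP-specific part
is guarded so that the file still compiles elsewhere, where it simply
reports failure.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#ifdef __DJGPP__
#include <dpmi.h>
#include <go32.h>
#include <sys/movedata.h>
#endif

/* Hypothetical helper: ask the OS for the 8.3 form of LONG_NAME via
   Int 21h/AX=71A8h with DH=1.  SHORT_NAME must hold at least 13
   bytes.  Returns SHORT_NAME, or NULL if the call fails or the
   interface is not available.  */
char *
gen_short_83 (const char *long_name, char *short_name)
{
  assert (long_name != NULL && short_name != NULL);
#ifdef __DJGPP__
  {
    __dpmi_regs r;
    size_t len = strlen (long_name) + 1;
    unsigned long in = __tb;          /* input at start of transfer buffer */
    unsigned long out = __tb + len;   /* result goes right after it */

    if (len + 13 > _go32_info_block.size_of_transfer_buffer)
      return NULL;
    memset (&r, 0, sizeof r);
    dosmemput (long_name, len, in);
    r.x.ax = 0x71a8;
    r.x.ds = in >> 4;
    r.x.si = in & 15;
    r.x.es = out >> 4;
    r.x.di = out & 15;
    r.x.dx = 0x0111;   /* DH=1: 8.3 format; DL=11h: OEM charset (assumed) */
    /* Fail on a DPMI error, on carry set, or if AX comes back as
       7100h, which indicates that the LFN API is not supported.  */
    if (__dpmi_int (0x21, &r) != 0 || (r.x.flags & 1) || r.x.ax == 0x7100)
      return NULL;
    dosmemget (out, 13, short_name);
    short_name[12] = '\0';
    return short_name;
  }
#else
  (void) long_name;
  (void) short_name;
  return NULL;   /* no DOS interrupt interface on this platform */
#endif
}
```

The sketch returns whatever name the OS hands back, bogus or not, and
makes no attempt to repair it.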

If we really want to get fancy, we could try to repair the result that
W2K returns.  But I'm not suggesting that: if the underlying OS call
is buggy, it is perfectly okay to return the messed up name it gives
us.

However, you are doing the work, so eventually it's your call.  If you
want to introduce a new function with the body you sent a while ago,
and rewrite the other library functions to call it instead of
_lfn_gen_short_fname, feel free to go ahead and do it.
