Mail Archives: djgpp-workers/2001/10/14/17:00:58

From: sandmann AT clio DOT rice DOT edu (Charles Sandmann)
Message-Id: <10110142056.AA14936@clio.rice.edu>
Subject: Re: W2K/XP fncase
To: eliz AT is DOT elta DOT co DOT il
Date: Sun, 14 Oct 2001 15:56:11 -0500 (CDT)
Cc: djgpp-workers AT delorie DOT com
In-Reply-To: <7263-Sun14Oct2001200248+0200-eliz@is.elta.co.il> from "Eli Zaretskii" at Oct 14, 2001 08:02:49 PM
X-Mailer: ELM [version 2.5 PL2]
Mime-Version: 1.0
Reply-To: djgpp-workers AT delorie DOT com
Errors-To: nobody AT delorie DOT com
X-Mailing-List: djgpp-workers AT delorie DOT com
X-Unsubscribes-To: listserv AT delorie DOT com

> Actually, the intended behavior is well defined, and does not rely on
> the interrupt.  It just uses the interrupt as a means to achieve a
> certain goal.
> What we really want is to be case-preserving and case-sensitive.

I understand the case issues on regular DOS.  I understand on pure lfn
implementations we wouldn't want or need fncase at all either.  So the
goal of calling the _lfn_gen_short_fname interrupt is to determine
if the name is a legal 8.3 DOS name in all upper case.

Are you aware that _lfn_gen_short_fname does *NOT* do this properly
even on W9x?  I can give examples of names I create under regular DOS
or with LFN=n, for which _lfn_gen_short_fname returns a different value
than what is stored on the file system.  In particular, space is a
valid character under DOS, and so are many other 8 bit characters.
So we don't lower case any files which contain spaces (and many that
contain 8 bit chars) but legitimately we should.  Sure, the impact
here is negligible, it does the right thing 99.9% of the time.  But 
we can't point to the interrupt on W9x and say it's doing the right
thing either.  Thus my argument to replace it with something.

> Actually, on Windows 9X, it's very easy to predict whether the file
> will be downcased or not; in that respect, _lfn_gen_short_fname does
> its job quite well on W9X.  It's W2K and XP that introduced the
> problem.

I mean from a user point of view.  I can display two names, identical
except for a difference in one 8-bit character.  One will convert, one won't.
Both were created with lfn=n or dual booting into DOS.  

> My point was that we cannot just decide to always downcase on DOS; see
> the details above.  I agree that we don't need to create a string and
> compare it, but any other solution should not downcase unconditionally.

I wasn't clear - I never intended to always downcase on DOS.  I just 
plan to say if lfn=n and "fncase flag" then go directly to downcase, don't
bother generating any names or logic.  If there are cases that some DOS
returns mixed case and we want to keep it, then we can process its names
too using the same algorithm.  
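
The plan above can be sketched as a small decision function in C.  This is
only one reading of the paragraph, not existing library code: the names
choose_case_action, lfn_active and preserve_case are hypothetical, and
preserve_case stands for the "fncase flag" being set to y.

```c
/* Hypothetical sketch of the decision described above; none of these
   names exist in the DJGPP library. */
enum case_action {
    KEEP_CASE,          /* FNCASE=y: leave the name exactly as returned */
    DOWNCASE_ALWAYS,    /* LFN=n: go directly to downcase, no name generation */
    DOWNCASE_IF_DOS83   /* LFN=y, FNCASE=n: downcase only upper-case 8.3 names */
};

enum case_action choose_case_action(int lfn_active, int preserve_case)
{
    if (preserve_case)
        return KEEP_CASE;
    if (!lfn_active)
        return DOWNCASE_ALWAYS;
    return DOWNCASE_IF_DOS83;
}
```

Only the last branch would need any per-name check at all.  A DOS that
returns mixed-case names could be routed through that same branch instead
of DOWNCASE_ALWAYS.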

> By ``this issue'' I meant the whole question of whether and when to
> downcase.  I did not mean function 71A8h on which the solution was
> based: that one works quite reliably on Windows 9X.  The semi-broken
> handling of letter-case in file names on Windows is what makes our
> decisions tricky and prone to unintended consequences.

I don't want to change anything here except for when fncase=n and lfn
is active.  (I actually would have just patched lfn_gen ... )

My proposal is to replace each occurrence of:
 !strcmp(_lfn_gen_short_fname(longname, shortname), longname)
with a new function:
 _is_DOS83_upper(longname)
 
Which does not call any buggy Windows interrupts.  It would not consider
names with lower case or +,;=[] as DOS83, or any with multiple periods, or 
leading periods.  Length must be <=8 chars before a period or 12 chars total.
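
A minimal sketch of such a check in C, following only the rules listed
above.  It is not the proposed library code: the name is_dos83_upper_sketch
is made up for illustration, the 3-character extension limit is inferred
from "8.3" and the 12-character total, and accepting an empty extension
("FOO.") is an assumption.  Spaces and 8-bit characters are accepted, since
they are valid under DOS.

```c
#include <stddef.h>
#include <string.h>

/* Sketch only: returns 1 if `name` looks like an all-upper-case 8.3 DOS
   name under the rules listed above, 0 otherwise.  No interrupt is called. */
int is_dos83_upper_sketch(const char *name)
{
    const char *dot = NULL;   /* position of the single allowed period */
    const char *p;
    size_t len = 0, base_len, ext_len;

    /* Reject empty names (assumption) and names with a leading period. */
    if (name == NULL || name[0] == '\0' || name[0] == '.')
        return 0;

    for (p = name; *p != '\0'; p++, len++) {
        unsigned char c = (unsigned char)*p;

        if (c >= 'a' && c <= 'z')         /* ASCII lower case */
            return 0;
        if (strchr("+,;=[]", c) != NULL)  /* characters valid only in long names */
            return 0;
        if (c == '.') {
            if (dot != NULL)              /* more than one period */
                return 0;
            dot = p;
        }
    }

    base_len = dot ? (size_t)(dot - name) : len;
    ext_len = dot ? len - base_len - 1 : 0;

    /* 8 + '.' + 3 gives the 12-character total mentioned above. */
    return base_len <= 8 && ext_len <= 3;
}
```

With this sketch "README.TXT", "A.A" and "MY FILE.TXT" are treated as DOS
8.3 names, while "ReadMe.txt", "A.B.C", ".EMACS" and "FOO+BAR.C" are not.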

> > If there is no way, then maybe fncase=y should be default if lfn is
> > enabled for all platforms.
> 
> You mean, including Windows 9X?  We could consider making this change
> now, but how do we check it won't get users in trouble?  Windows 9X
> and ME are still the most popular systems.

Your original proposal was to just always force Win2K and XP into 
fncase=y, so I took this proposal to mean it was low risk.  If so, we
should do it on everything.  If not, we should fix fncase=n to be
consistent.

> I wasn't being personal (took a long, deep breath before writing that ;-), 
> it's just that you mentioned the performance issue several times, in a
> way that seemed to indicate that it's an important factor in this
> discussion.  If this isn't important, then let's simply put this
> aspect aside for a moment, okay?  After all, if the function were
> working on W2K/XP, we wouldn't even consider rewriting this code, right?

Correct.  It's a very minor factor.  If we do have to write something,
minimizing interrupts is a low-priority goal if there is a better way.

> > DH=1 breaks for essentially any non-Alpha character, so is a very
> > poor effort at fixing anything.
> 
> I agree; I wasn't aware of that when I wrote that suggestion.

It's even worse after more testing: it can't even get A.A right.
A.A -> A7919.A

> I'm not comfortable because I'm afraid to break things.  The
> letter-case handling is very fragile, so the more localized the change
> and more predictable its effect, the less risk we run.  Changes that
> touch all platforms and modify the resulting string (the one returned
> by _lfn_gen_short_fname) in non-trivial ways is something whose
> effects I have no way of knowing in advance.

I think I understand the 7 places we currently use it enough to believe
this is a relatively low risk change.  I have no idea if any external
code uses _lfn_gen_short_fname for anything useful - if so they will
be badly broken today on W2K/XP until fixed.  

> Ideally, we should only downcase names of files which don't have an
> LFN entry.  But I'm afraid that this criterion doesn't have a
> practical solution.
> 
> Failing that, any file name that is valid on DOS should be downcased,
> unless we have a clear hint from the user that she doesn't want that,
> for example when we know that the upper-case part was actually typed
> by the user, not came from the OS.

Nothing I'm proposing would change any of that, except for the
implementation of guessing if it has an LFN entry.

> Hope this helps, and thanks again for working on this.

Yes, it helps.
