Mail Archives: cygwin/2003/01/03/20:28:44

Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm
List-Subscribe: <mailto:cygwin-subscribe AT cygwin DOT com>
List-Archive: <http://sources.redhat.com/ml/cygwin/>
List-Post: <mailto:cygwin AT cygwin DOT com>
List-Help: <mailto:cygwin-help AT cygwin DOT com>, <http://sources.redhat.com/ml/#faqs>
Sender: cygwin-owner AT cygwin DOT com
Mail-Followup-To: cygwin AT cygwin DOT com
Delivered-To: mailing list cygwin AT cygwin DOT com
From: "linda w \(cyg\)" <cygwin AT tlinx DOT org>
To: <module-authors AT perl DOT org>
Cc: <cygwin AT cygwin DOT com>
Subject: File::Spec, Cygwin, Syntactic vs. Semantic path analysis is.
Date: Fri, 3 Jan 2003 17:27:51 -0800
Message-ID: <000301c2b390$84d3a2c0$1403a8c0@sc.tlinx.org>
MIME-Version: 1.0
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
Importance: Normal
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1106
X-MIME-Autoconverted: from quoted-printable to 8bit by delorie.com id h041ShS01097

A bit late to the party, I know, but wanted to chime in on the Cygwin
File::Spec discussion.  I'm 'cc'ing the cygwin list as a "heads up" for any
interested parties.

A more satisfactory mapping is to base "Cygwin" on Win32, not Unix.

Cygwin, as an "OS interface" _partially_ supports posix mapping -- it supports
posix naming to the same extent the underlying Win32 OS supports it -- not
to the level Unix supports it.

For example,
1) "\" and ":" are illegal as filename characters in Cygwin pathnames, as they
are under Win32.  POSIX has no such restriction.

2) By default, case doesn't matter under Cygwin in the cases where it doesn't
matter under Win32.  It's not _ignored_ -- case is preserved on filename
creation but disregarded on lookup.  Ex:

law/proj/fspec> touch aBcDe
law/proj/fspec/tmp> ls
aBcDe							(expected)
law/proj/fspec/tmp> mv ABCDE AbCdE		(this works!)
law/proj/fspec/tmp> ls
AbCdE							(case is changed)
law/proj/fspec/tmp> touch aBcDe		(try to recreate original filename)
law/proj/fspec/tmp> ls
AbCdE							(win32 behavior, not posix behavior)
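
The transcript above can be mimicked with a toy model.  This is only a sketch
of "case-preserving, case-insensitive" naming in Python; the class and method
names are made up for illustration and have nothing to do with how Cygwin or
Win32 implement it.

```python
class CasePreservingDir:
    """Toy directory: names keep their spelling, lookups ignore case."""

    def __init__(self):
        self._entries = {}  # lowercased name -> name as stored

    def touch(self, name):
        # An existing entry that matches case-insensitively is reused,
        # so its stored spelling does not change.
        self._entries.setdefault(name.lower(), name)

    def rename(self, old, new):
        # The lookup of `old` ignores case; the new spelling is stored.
        if old.lower() not in self._entries:
            raise FileNotFoundError(old)
        del self._entries[old.lower()]
        self._entries[new.lower()] = new

    def listing(self):
        return sorted(self._entries.values())


d = CasePreservingDir()
d.touch("aBcDe")
d.rename("ABCDE", "AbCdE")   # works, as in the transcript
d.touch("aBcDe")             # does not create a second file
print(d.listing())
```

The final listing holds a single name with the spelling given to the rename,
which is the Win32-style result shown in the transcript.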

3) Using unix and win32 syntaxes to parse the valid cygwin filenames
"i:/fee/fie/foe/fum" and "i:fee/fie/foe/fum" yields (under Perl 5.8):

unix:v,d,f=, i:/fee/fie/foe/, fum
unix: 'i:', 'fee', 'fie', 'foe', ''
Win32:v,d,f=i:, /fee/fie/foe/, fum
Win32:'', 'fee', 'fie', 'foe', ''
    and
unix:v,d,f=, i:fee/fie/foe/, fum
unix: 'i:fee', 'fie', 'foe', ''
Win32:v,d,f=i:, fee/fie/foe/, fum
Win32:'fee', 'fie', 'foe', ''

	Under cygwin, you cannot create a filename "i:fee" -- it creates a filename
'fee' (somewhere**) on the "i:" filesystem (volume). 
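
The same contrast between the two syntaxes shows up outside Perl.  As a rough
analogy (not File::Spec itself), Python ships both rule sets as `ntpath` and
`posixpath`, and both work on plain strings on any platform.  The helper name
`split_both` is mine.

```python
import ntpath
import posixpath


def split_both(path):
    """Split one path string under Windows rules and under POSIX rules."""
    # Windows rules: a leading "<letter>:" is split off as the drive.
    drive, rest = ntpath.splitdrive(path)
    # POSIX rules: ":" is an ordinary character, so there is no volume.
    directory, filename = posixpath.split(path)
    return {"nt": (drive, rest), "posix": (directory, filename)}


for p in ("i:/fee/fie/foe/fum", "i:fee/fie/foe/fum"):
    print(p, split_both(p))
```

Under the Windows rules both names carry the volume "i:"; under the POSIX
rules "i:" and "i:fee" are just the first directory component.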

====
=>	For some reason that eludes me, the current File::Spec implementation
=> returns a 'null' directory as the last directory component while no such
=> component existed in the original pathname.  I would tend to think this is a
=> bug. Elucidating comments?  Agreement?  Disagreement?
====

	Cygwin, and possibly the Win32 module, are inconsistent in handling the
differences between i:/foobar/ and i:.  On one hand i: is considered a 'volume',
but on the other hand i:/ seems to evaluate to the same, incorrect, value.
In "Win32", for each 'fs' of the form "<x>:", with x of class <[:alpha:]>,
there is a process-specific "current directory".  This can be seen as follows:

From cmd.exe, cygwin utils "ls", "pwd", "printenv" and "grep" are in my path.

1) Fresh cmd shell, what's on filesystem "Z"?

C:\Documents and Settings\law>dir /b /a z:	## /a=show hidden, /b=skip header
desktop.ini
Content.IE5						## expected output, /a=show hidden files
							
C:\Documents and Settings\law>ls z:
Content.IE5  desktop.ini			## again, expected output...
							
2) Where are we?

C:\Documents and Settings\law>cd
C:\Documents and Settings\law			## expected -- shows same as the
							## default prompt:
C:\Documents and Settings\law>echo %prompt%
$P$G							## $p=cur drive and path, $g=">"
C:\Documents and Settings\law>pwd		## on cygwin?
/cygdrive/c/Documents and Settings/law	## hmmm...cygwin translated c: to
							## /cygdrive/c;  seems like under
							## cygwin, /cygdrive/c is a win32 volume.
3. What does File::Spec say about that?
ishtar:law/proj/fspec> fspec "c:Documents and Settings/law/"
Win32:v,d,f=c:, Documents and Settings/law/, 
Win32:'Documents and Settings', 'law', ''	## expected, C: is a fs (volume)
cygwin: v,d,f=, c:Documents and Settings/law/, 
cygwin: 'c:Documents and Settings', 'law', ''
							## oops, seems File::Spec::Cygwin
							## doesn't act the same way the Cygwin
							## layer does...>>not good<<, bug?

4. How about "/cygwin/c/Documents and Settings/law" ?
cygwin: v,d,f=, /cygwin/c/Documents and Settings/law/, 
cygwin: '', 'cygwin', 'c', 'Documents and Settings', 'law', ''
unix:v,d,f=, /cygwin/c/Documents and Settings/law/, 
unix: '', 'cygwin', 'c', 'Documents and Settings', 'law', ''
Win32:v,d,f=, /cygwin/c/Documents and Settings/law/, 
Win32:'', 'cygwin', 'c', 'Documents and Settings', 'law', ''

							## All the same...good, or is it?...um
5.
							## under cygwin, /cygwin/x/ can't be used
							## as a filename:
law/proj/fspec> mkdir -p /cygdrive/e/foo
mkdir: cannot create directory `/cygdrive/e': No such file or directory
							## looks like /cygdrive/<x> is just
							## as reserved as '<x>:' -- it's
							## a <fs> spec

oops...under cygwin, /cygdrive/x/
							## always specifies a volume yet
							## File::Spec doesn't seem to know
							## about this distinction -- oh well,
							## we can use rule if win32 elements
							## present, 
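
A volume-aware split for Cygwin-style names might look like the sketch below.
It is written in Python for brevity and is not part of File::Spec; the function
name is hypothetical, and it assumes the default "/cygdrive" prefix (Cygwin
allows that prefix to be changed).

```python
import re

# Assumption: the default "/cygdrive" prefix is in use.
_CYGDRIVE = re.compile(r"^/cygdrive/([A-Za-z])(?=/|$)")
_DRIVE = re.compile(r"^([A-Za-z]):")


def split_cygwin_volume(path):
    """Return (volume, rest), treating /cygdrive/<x> and <x>: as volumes."""
    m = _CYGDRIVE.match(path) or _DRIVE.match(path)
    if m is None:
        return "", path
    return m.group(0), path[m.end():]


print(split_cygwin_volume("/cygdrive/c/Documents and Settings/law"))
print(split_cygwin_volume("c:Documents and Settings/law/"))
print(split_cygwin_volume("/usr/bin"))
```

Only a single letter directly under /cygdrive counts as a volume here, so a
name like /cygdrive/cc/x is left alone.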
6. How about remote file systems?
ishtar:law/proj/fspec> fspec '\\fee\fie\foe\fum'
argv#=0
cygwin: v,d,f=, , \\fee\fie\foe\fum
cygwin: 
unix:v,d,f=, , \\fee\fie\foe\fum
unix: 
Win32:v,d,f=\\fee\fie, \foe\, fum
Win32:'', 'foe', ''
							## hmmm. cygwin thinks it is somehow
							## different than the Win32 parsing.
							## But it is not.  As stated above,
							## "\" isn't a valid filename character
							## under cygwin.  The correct parsing
							## is the Win32 version -- since "\\"
							## always starts a hostname under
							## cygwin and win32.  The "component"
							## is the remote public share name.

7. What about 'other drives'							
C:\Documents and Settings\law>z:
Z:\							## let's go to z...
Z:\>pwd						## prompt indicates Win32 curdir
/cygdrive/z						## cygwin...consistent

8. Environmental factors...(back to 'c')
C:\Documents and Settings\law>printenv |grep Z		## nothing in env for Z
C:\Documents and Settings\law>cd z:Content.IE5
							## sets curdir in env:
C:\Documents and Settings\law>printenv |grep Z
!Z:=Z:\Content.IE5				## it's there now...

9. Current dir check:
C:\Documents and Settings\law>pwd		## win32 prompt
/cygdrive/c/Documents and Settings/law	## cygwin -- expected

10: What's on Z again ...
C:\Documents and Settings\law>dir /b /a z:
index.dat
desktop.ini
5OG8E8Y3
RRJT4N16
30F1BLRO
UYYGJ85V						## Ahh...the Win32 separate curdir
							## concept.

C:\Documents and Settings\law>ls z:
Content.IE5  desktop.ini			## uhoh, cygwin's ignoring the underlying
							## OS concepts, this doesn't bode well

11. Let's switch to our 'Z' drive again
C:\Documents and Settings\law>z:
Z:\Content.IE5>					## shows up win32's per-drive curdir
							## concept
Z:\Content.IE5>pwd
/cygdrive/z/Content.IE5				## pwd provides consistent feedback,

12. Commands: mkdir "x:c", "<ls/dir> x:", cd "x:c", "<ls/dir> x:",
	assuming x is a drive for the win32/cygwin examples:
	On Unix - 2 errors and ending up in dir "x:c"
	On Win - no errors and same output -- no change in current directory
on the current drive.
	On cygwin - output of the two dirs is different and the current
directory is changed.

	Cygwin isn't really behaving like Unix or Win32.
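
The per-drive current directory seen in steps 8 through 11 can be modelled in
a few lines.  This is a toy Python sketch under my own assumptions (class and
method names are invented, and it ignores ".." and similar details); it only
shows why "z:" and "z:\" can name different directories under Win32.

```python
class DriveCwds:
    """Toy model of Win32-style per-drive current directories."""

    def __init__(self, current_drive):
        self.current_drive = current_drive.upper()
        self.cwd = {}  # drive letter -> current directory on that drive

    def chdir(self, drive, directory):
        # Like `cd z:Content.IE5` from cmd.exe: records the directory for
        # that drive without switching the current drive.
        self.cwd[drive.upper()] = directory

    def resolve(self, path):
        """Expand names such as 'z:' or 'z:foo' to an absolute name."""
        if len(path) >= 2 and path[0].isalpha() and path[1] == ":":
            drive, rest = path[0].upper(), path[2:]
        else:
            drive, rest = self.current_drive, path
        if rest.startswith(("\\", "/")):
            return drive + ":" + rest          # already rooted on the drive
        base = self.cwd.get(drive, "\\")       # default: root of the drive
        if not rest:
            return drive + ":" + base
        return drive + ":" + base.rstrip("\\") + "\\" + rest


m = DriveCwds("C")
print(m.resolve("z:"))            # root of Z before any cd
m.chdir("z", "\\Content.IE5")
print(m.resolve("z:"))            # now the per-drive current directory
print(m.resolve("z:\\"))          # the root, regardless of the cd
```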

	So I see a couple of problems: 
1) File::Spec's Cygwin handling should mainly be based on Win32, if not exactly
the same.
2) Cygwin should pay attention to the Win32 concept of per-drive pathnames
	when win32 drive letters are used (though /cygdrive/c should still refer to
the root dir of the 'c' drive).
3) File::Spec needs cleanup.  Is it supposed to be parsing "Syntactically" or
"Semantically"?  They are different.
4) File::Spec::Win32 would give incorrect results on one of my old Samba
exported fs's: server exported people's home directories as "/home/<user>"
to prevent Win98 from automatically using \home\<user> as a location for
a traveling profile.

Syntactic parsing would yield a volume/fs of //server/home and
file=<user>, but the exported fs name was really "/home/user", and should have
been parsed as one volume name: //server/home/user, or in DOS-syntax,
"\\server\home/user".

Syntactically, this can't be anticipated or interpreted, so a simple,
documented limitation would be used: the assumption that \ and / are not
intermixed as pathname component separators in the same pathname.  So the first
"/" sets the dir sep to "/", and a later "\" could signal a warning that the
syntax is unclear.  But a pathname with "\" as the first dir sep would throw an
illegal-filename exception if "/" was encountered, because in places where '\'
is a dirsep, '/' is a switch character.
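
A minimal sketch of that separator rule, in Python rather than Perl and with a
function name of my own choosing, could be:

```python
import warnings


def directory_separator(path):
    """Pick the separator from the first one seen; flag mixed use."""
    first = next((ch for ch in path if ch in "/\\"), None)
    if first == "/" and "\\" in path:
        # "/" came first: keep it as the separator, but the name is ambiguous.
        warnings.warn(f"unclear syntax: '\\' used after '/' in {path!r}")
    elif first == "\\" and "/" in path:
        # "\" came first: "/" would be a switch character, so reject the name.
        raise ValueError(f"illegal filename: '/' used after '\\' in {path!r}")
    return first


print(directory_separator("i:/fee/fie/foe/fum"))
print(directory_separator(r"\\fee\fie\foe\fum"))
```

Raising ValueError is just one way to model the "illegal-filename exception";
a real implementation would pick whatever error convention the module uses.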

Referring to problem 3, above: semantically, under Unix as in other OS's,
separate filesystems should be parsed out as separate volumes -- since the
concept of a 'volume' in OS terminology is most often used to describe a
filesystem.  Older usage, I believe, used the term interchangeably with 'disk',
but in modern usage (influenced most commonly by Win32), it is a file system.
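
To make the "semantic" reading concrete: finding the volume of a name then
means asking the running system where the enclosing filesystem is mounted,
which no amount of string parsing can tell you.  A rough Python sketch, with a
helper name I invented, assuming the path exists on the current machine:

```python
import os


def containing_volume(path):
    """Walk upward until a mount point is reached and return it."""
    current = os.path.realpath(path)
    while not os.path.ismount(current):
        parent = os.path.dirname(current)
        if parent == current:  # reached the top without finding a mount
            break
        current = parent
    return current


print(containing_volume(os.getcwd()))
```

The answer depends on the machine it runs on, which is exactly the difference
from the syntax-only parsing discussed above.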

The Win32 module, semantically, also, for consistency, has to recognize the
same problem in later NT-based OS's since 'volumes' can be mounted on arbitrary
mount points in the fs-hierarchy.  This underscores the concept of a
'volume' as a mountable fs -- a definition that, semantically, would apply to
Unix as well.  But, vaguely, it appears the intent of File::Spec was to provide
OS-blind ways of manipulating arbitrary filenames (not necessarily limited to
valid filenames on the current system).  As such, default manipulation routines
should consistently be altered to be 'syntax-only' (except where absolutely
necessary: ex. 'pwd' function).

I'm not decided on the best syntax to provide an additional layer that does
semantic analysis based on what is true on the current system.

Comments?  No flames, please...we don't need to get personal or religious on
what should be an engineering discussion (in which I may have misconceptions,
but may just be trying to look at the problem space from a different
perspective).

thanks,
-linda


