Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm
Delivered-To: mailing list cygwin AT cygwin DOT com
From: "linda w (cyg)"
Subject: File::Spec, Cygwin, Syntactic vs. Semantic path analysis is.
Date: Fri, 3 Jan 2003 17:27:51 -0800
Message-ID: <000301c2b390$84d3a2c0$1403a8c0@sc.tlinx.org>

A bit late to the party, I know, but I wanted to chime in on the Cygwin
File::Spec discussion. I'm cc'ing the cygwin list as a "heads up" for
any interested parties.

A more satisfactory mapping is to base "Cygwin" on Win32, not Unix.
Cygwin, as an "OS interface", _partially_ supports posix mapping -- it
supports posix naming to the same extent the underlying Win32 OS supports
it -- not to the level Unix supports it. For example:

1) The characters "\" and ":" are illegal in Cygwin pathnames, as they
   are under Win32. Posix doesn't have this restriction.

2) By default, case doesn't matter under Cygwin in the cases where it
   doesn't matter under Win32. It's not _ignored_ -- case is preserved on
   filename creation but not on lookup. Ex:

   law/proj/fspec> touch aBcDe
   law/proj/fspec/tmp> ls
   aBcDe                                (expected)
   law/proj/fspec/tmp> mv ABCDE AbCdE   (this works!)
   law/proj/fspec/tmp> ls
   AbCdE                                (case is changed)
   law/proj/fspec/tmp> touch aBcDe      (try to recreate original filename)
   law/proj/fspec/tmp> ls
   AbCdE                                (win32 behavior, not posix behavior)

3) Using unix and win32 syntaxes to parse the valid cygwin filenames
   "i:/fee/fie/foe/fum" and "i:fee/fie/foe/fum" yields (under perl 5.8):

   unix:v,d,f=, i:/fee/fie/foe/, fum
   unix: 'i:', 'fee', 'fie', 'foe', ''
   Win32:v,d,f=i:, /fee/fie/foe/, fum
   Win32:'', 'fee', 'fie', 'foe', ''

   and

   unix:v,d,f=, i:fee/fie/foe/, fum
   unix: 'i:fee', 'fie', 'foe', ''
   Win32:v,d,f=i:, fee/fie/foe/, fum
   Win32:'fee', 'fie', 'foe', ''

   Under cygwin, you cannot create a filename "i:fee" -- it creates a
   filename 'fee' (somewhere**) on the "i:" filesystem (volume).

====
=> For some reason that eludes me, the current File::Spec implementation
=> returns a 'null' directory as the last directory component while no such
=> component existed in the original pathname. I would tend to think this is a
=> bug. Elucidating comments? Agreement? Disagreement?
====

Cygwin, and possibly the Win32 module, are inconsistent in handling the
differences between i:/foobar/ and i:. On one hand i: is considered a
'volume', but on the other hand i:/ seems to evaluate to the same,
incorrect, value.

In "Win32", for each 'fs' of the form "x:", x of class <[:alpha:]>, there
is a process-specific "current directory". This can be seen in the
following session, run from cmd.exe; the cygwin utils "ls", "pwd",
"printenv" and "grep" are in my path.

1) Fresh cmd shell, what's on filesystem "Z"?

   C:\Documents and Settings\law>dir /b /a z:
                                        ## /a=show hidden, /b=skip header
   desktop.ini
   Content.IE5                          ## expected output, /a=show hidden files
   C:\Documents and Settings\law>ls z:
   Content.IE5 desktop.ini              ## again, expected output...

2) Where are we?

   C:\Documents and Settings\law>cd
   C:\Documents and Settings\law        ## expected -- shows same as the
                                        ## default prompt:
   C:\Documents and Settings\law>echo %prompt%
   $P$G                                 ## $p=cur drive and path, $g=">"
   C:\Documents and Settings\law>pwd    ## on cygwin?
   /cygdrive/c/Documents and Settings/law
                                        ## hmmm...cygwin translated c: to
                                        ## /cygdrive/c; seems like under
                                        ## cygwin, /cygdrive/c is a win32 volume.

3. What does File::Spec say about that?

   ishtar:law/proj/fspec> fspec "c:Documents and Settings/law/"
   Win32:v,d,f=c:, Documents and Settings/law/,
   Win32:'Documents and Settings', 'law', ''
                                        ## expected, C: is a fs (volume)
   cygwin: v,d,f=, c:Documents and Settings/law/,
   cygwin: 'c:Documents and Settings', 'law', ''
                                        ## oops, seems File::Spec::Cygwin
                                        ## doesn't act the same way the Cygwin
                                        ## layer does...>>not good<<, bug?

4. How about "/cygwin/c/Documents and Settings/law" ?

   cygwin: v,d,f=, /cygwin/c/Documents and Settings/law/,
   cygwin: '', 'cygwin', 'c', 'Documents and Settings', 'law', ''
   unix:v,d,f=, /cygwin/c/Documents and Settings/law/,
   unix: '', 'cygwin', 'c', 'Documents and Settings', 'law', ''
   Win32:v,d,f=, /cygwin/c/Documents and Settings/law/,
   Win32:'', 'cygwin', 'c', 'Documents and Settings', 'law', ''
                                        ## All the same...good, or is it?...um

5.                                      ## under cygwin, /cygwin/x/ can't be used
                                        ## as a filename:
   law/proj/fspec> mkdir -p /cygdrive/e/foo
   mkdir: cannot create directory `/cygdrive/e': No such file or directory
                                        ## looks like /cygdrive/ is just
                                        ## as reserved as ':' -- it's
                                        ## a spec
                                        ## oops...under cygwin, /cygdrive/x/
                                        ## always specifies a volume yet
                                        ## File::Spec doesn't seem to know
                                        ## about this distinction -- oh well,
                                        ## we can use rule if win32 elements
                                        ## present,

6. How about remote file systems?

   ishtar:law/proj/fspec> fspec '\\fee\fie\foe\fum'
   argv#=0
   cygwin: v,d,f=, , \\fee\fie\foe\fum
   cygwin:
   unix:v,d,f=, , \\fee\fie\foe\fum
   unix:
   Win32:v,d,f=\\fee\fie, \foe\, fum
   Win32:'', 'foe', ''
                                        ## hmmm. cygwin thinks it is somehow
                                        ## different than the Win32 parsing.
                                        ## But it is not. As stated above,
                                        ## "\" isn't a valid filename character
                                        ## under cygwin. The correct parsing
                                        ## is the Win32 version -- since "\\"
                                        ## always starts a hostname under
                                        ## cygwin and win32.
                                        ## The "component"
                                        ## is the remote public share name.

7. What about 'other drives'?

   C:\Documents and Settings\law>z:
   Z:\                                  ## lets goto z...
   Z:\>pwd                              ## prompt indicates Win32 curdir
   /cygdrive/z                          ## cygwin...consistent

8. Environmental factors...(back to 'c')

   C:\Documents and Settings\law>printenv |grep Z
                                        ## nothing in env for Z
   C:\Documents and Settings\law>cd z:Content.IE5
                                        ## sets curdir in env:
   C:\Documents and Settings\law>printenv |grep Z
   !Z:=Z:\Content.IE5                   ## it's there now...

9. Current dir check:

   C:\Documents and Settings\law>pwd    ## win32 prompt
   /cygdrive/c/Documents and Settings/law
                                        ## cygwin -- expected

10. What's on Z again ...

   C:\Documents and Settings\law>dir /b /a z:
   index.dat
   desktop.ini
   5OG8E8Y3
   RRJT4N16
   30F1BLRO
   UYYGJ85V                             ## Ahh...the Win32 separate curdir
                                        ## concept.
   C:\Documents and Settings\law>ls z:
   Content.IE5 desktop.ini              ## uhoh, cygwin's ignoring the underlying
                                        ## OS concepts, this doesn't bode well

11. Lets switch to our 'Z' drive again

   C:\Documents and Settings\law>z:
   Z:\Content.IE5>                      ## shows up win32's per-drive curdir
                                        ## concept
   Z:\Content.IE5>pwd
   /cygdrive/z/Content.IE5              ## pwd provides consistent feedback,

12. Commands: mkdir "x:c", x:", cd "x:c", " x:", assuming x is a drive
    for win32/cygwin examples:

   On Unix   - 2 errors and ending up in dir "x:c"
   On Win    - no errors and same output -- no change in current
               directory on current drive.
   On cygwin - output of the two dirs is different and the current
               directory is changed.

Cygwin isn't really behaving like Unix or Win32.

So I see a couple of problems:

1) File::Spec for Cygwin should mainly be based on Win32, if not exactly
   the same.

2) Cygwin should pay attention to the Win32 concept of per-drive
   pathnames when win32 drive letters are used (though /cygdrive/c should
   still refer to the root dir of the 'c' drive).

3) File::Spec needs cleanup. Is it supposed to be parsing "Syntactically"
   or "Semantically"? They are different.
4) File::Spec::Win32 would give incorrect results on one of my old
   Samba-exported fs's: the server exported people's home directories as
   "/home/" to prevent Win98 from automatically using \home\ as a location
   for a traveling profile. Syntactic parsing would yield a volume/fs of
   //server/home and file= but the exported fs name was really
   "/home/user", and should have been parsed as one volume name:
   //server/home/user, or in DOS-syntax, "\\server\home/user".

   Syntactically, this can't be anticipated or interpreted, so a simple,
   documented limitation would be used: the assumption that \ and / are
   not intermixed as pathname component separators in the same pathname.
   So the first "/" sets the dir sep to "/", and a later "\" could signal
   a warning that the syntax is unclear. But a pathname with "\" as the
   first dir sep would throw an illegal-filename exception if "/" was
   encountered, because in places where '\' is a dirsep, '/' is a switch
   character.

Referring to problem 3, above: semantically, under Unix as in other OS's,
separate filesystems should be parsed out as separate volumes -- since
the concept of a 'volume' in OS terminology is most often used to
describe a filesystem. Older usage, I believe, used the term
interchangeably with 'disk', but in modern usage (influenced most
commonly by Win32), it is a file system. For consistency, the Win32
module also has to recognize the same problem, semantically, in later
NT-based OS's, since 'volumes' can be mounted on arbitrary mount points
in the fs-hierarchy. This underscores the concept of a 'volume' as a
mountable fs -- a definition that, semantically, would apply to Unix as
well.

But, vaguely, it appears the intent of File::Spec was to provide OS-blind
ways of manipulating arbitrary filenames (not necessarily limited to
valid filenames on the current system). As such, default manipulation
routines should consistently be altered to be 'syntax-only' (except where
absolutely necessary: ex. 'pwd' function).
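
To make the 'syntax-only' idea concrete, here is a minimal sketch. It is
not File::Spec: it uses Python's standard ntpath and posixpath modules as
stand-ins for the Win32 and Unix parsers, since both work purely on the
path string no matter which OS they run on. The helper name split_vdf is
made up for this illustration, and its directory field drops the trailing
separator, so its output will not match File::Spec's splitpath exactly.

```python
import ntpath      # Win32 path syntax, usable on any host
import posixpath   # Unix path syntax, usable on any host


def split_vdf(path, flavor):
    """Split a path into (volume, directory, file) by syntax alone.

    `flavor` is ntpath or posixpath. Nothing here touches the file
    system, so the path does not need to exist or be valid locally.
    (Hypothetical helper, named after the v,d,f output quoted above.)
    """
    volume, rest = flavor.splitdrive(path)       # "i:" or "\\host\share" under ntpath
    directory, filename = flavor.split(rest)     # split at the last separator
    return volume, directory, filename


# The same strings the post feeds to its parsers.
samples = ["i:/fee/fie/foe/fum", "i:fee/fie/foe/fum", r"\\fee\fie\foe\fum"]
for sample in samples:
    for name, flavor in (("unix", posixpath), ("Win32", ntpath)):
        print(name, split_vdf(sample, flavor))
```

Under the Win32 flavor the "i:" prefix and the "\\fee\fie" host/share pair
come back as the volume; under the Unix flavor the volume is always empty
and "i:fee" is just part of a directory name. Neither answer depends on
the machine running the code, which is the property asked for above.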

I'm not decided on the best syntax to provide an additional layer that
does semantic analysis based on what is true on the current system.

Comments? No flames, please...we don't need to get personal or religious
on what should be an engineering discussion (in which I may have
misconceptions, but may just be trying to look at the problem space from
a different perspective).

thanks,
-linda