Mail Archives: cygwin/2005/03/05/20:48:33

Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm
List-Subscribe: <mailto:cygwin-subscribe AT cygwin DOT com>
List-Archive: <http://sourceware.org/ml/cygwin/>
List-Post: <mailto:cygwin AT cygwin DOT com>
List-Help: <mailto:cygwin-help AT cygwin DOT com>, <http://sourceware.org/ml/#faqs>
Sender: cygwin-owner AT cygwin DOT com
Mail-Followup-To: cygwin AT cygwin DOT com
Delivered-To: mailing list cygwin AT cygwin DOT com
X-Sender: outgoing mail@ (Unverified)
Date: Sun, 06 Mar 2005 02:48:14 +0100
To: cygwin AT cygwin DOT com
From: Arend-Jan Westhoff <jpmcyafvmhsl AT spammotel DOT com>
Subject: Re: Bug diff 2.8.7: Separate dir
Mime-Version: 1.0
Message-Id: <20050306014821.3C2D721016A@warserver.warande.net>

Thanks for the explanation. However, I don't quite understand why this is
what one would want.

With regard to paths, I would expect one to want the following:
A Windows- or POSIX-style path is converted to one internal path format.
After this conversion the behaviour is independent of whatever the
original format was.
In my opinion diff clearly violates this behaviour.
Also, on reading the User's Guide chapter "Mapping path names", I get
the idea that the behaviour I would want is described there. I quote:
	Cygwin supports both Win32- and POSIX-style paths, where directory 
	delimiters may be either forward or back slashes.
	..., Cygwin maintains a special internal POSIX view of the Win32 file 
	system that allows these programs to successfully run under Windows. 
	Cygwin uses this mapping to translate between Win32 and POSIX 
	paths as necessary.
It also seems inconsistent, if what you say is truly correct and intended,
that when I take file 'a' from my original example and do the following:
	copy a b
that then:
	diff ./a .\b
says that the files are completely different, whereas:
	diff ./a .\a
says they are completely equal, while files a and b are character for
character identical!
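
To make the effect concrete, here is a minimal Python sketch of what I
believe happens, assuming (per the explanation quoted below) that one
operand is read with CRLF-to-LF translation and the other is read raw.
The function names are made up for illustration and nothing here calls
Cygwin itself:

```python
# Hypothetical illustration only: this simulates the two read paths in
# plain Python and does not call any Cygwin API.

def read_textmode(data: bytes) -> list[bytes]:
    """Simulate a textmode read: CRLF becomes LF before splitting lines."""
    return data.replace(b"\r\n", b"\n").split(b"\n")


def read_binmode(data: bytes) -> list[bytes]:
    """Simulate a binmode read: split on LF with no translation at all."""
    return data.split(b"\n")


# Files 'a' and 'b' hold byte-for-byte identical CRLF content.
content_a = b"one\r\ntwo\r\n"
content_b = bytes(content_a)

# Both operands read the same way: the files compare equal.
same_mode_equal = read_textmode(content_a) == read_textmode(content_b)

# One operand translated, the other not: each text line differs by a
# trailing CR, so the files look completely different.
mixed_mode_equal = read_textmode(content_a) == read_binmode(content_b)

print(same_mode_equal, mixed_mode_equal)
```

The point of the sketch is only that the mismatch comes from reading the
two operands differently, not from the file contents.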

Text <-> Binary mode.
It is not so much a directory structure or mount that is text or binary;
rather, it is an individual file that is either text or binary. What a
mount can describe is an intention about how text files are produced on
that mount, e.g. whether to use only LF or CRLF.
If, as I suspect, in Cygwin a Textmode mount means:
	produce text files with CRLF 

and a Binmode mount means:
	produce text files with LF
then this is somewhat confusing since both modes are actually only
concerned with text files.
Nevertheless, I think the reason they are called Textmode and Binmode
is this: when you read a file and want to convert a text file to a
standard line representation with only a single LF, you don't need to
convert files that already use only LF. Not converting is also
precisely what you should do with binary files, hence binary mode.
I would like to make two points:
1. The User's Guide suggests that whether a command opens a file
in binary or text mode can (should?) depend on the command:
	..., all programs using lines as records (such as bash, make, 
	sed ...) would open files (and change the mode of their standard 
	input and output) as text.
(Please note that this should include diff too, since it is line
oriented as well.)
Or it can depend on the environment:
	Most other programs (such as cat, cmp, tr) use the default mode.
This environment includes:
	Textmode or Binmode mount. Environment variable: CYGWIN.
However, all of this overlooks the fact that whether a file should be
opened in text or binary mode depends primarily on the file:
If it is binary, conversions should never take place!
If it is a text file, then conversion may take place.
2. Note that for text files one could always do a conversion from 
CRLF to LF on reading independent of the mode. (Text files with 
only LF are simply left invariant.)
(All this is not to imply that it is easy to distinguish a text file
from a binary file, nor that it is always unreasonable to have
a command treat all of its input as text.)
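
Both points can be illustrated with a small Python sketch. The helper
name crlf_to_lf is made up for this example, and the "binary" payload is
an arbitrary byte string I chose; it is not taken from any real file:

```python
def crlf_to_lf(data: bytes) -> bytes:
    """Translate every CRLF pair to a single LF; other bytes are untouched."""
    return data.replace(b"\r\n", b"\n")


unix_text = b"alpha\nbeta\n"
windows_text = b"alpha\r\nbeta\r\n"

# Point 2: LF-only text is left invariant, and CRLF text is mapped to the
# same LF-only representation.
unix_unchanged = crlf_to_lf(unix_text) == unix_text
windows_normalized = crlf_to_lf(windows_text) == unix_text

# Point 1: binary data that merely happens to contain the bytes 0x0D 0x0A
# is silently altered by the same conversion (one byte is lost here).
binary_payload = b"\x00\x01\r\n\xff\xfe"
binary_damaged = crlf_to_lf(binary_payload) != binary_payload

print(unix_unchanged, windows_normalized, binary_damaged)
```

So the conversion is harmless for any text file regardless of its line
endings, but not for binary data.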

diff.
All files compared by diff should be subject to the same conversion.

This is clearly violated by diff 2.8.7.
Please note, as I said before, that for text files one can always
perform a CRLF to LF conversion on reading. This should make
it more convenient to compare UNIX and Windows native files, for example.
A consequence is that if a text file and a binary file are compared,
one should not apply any conversions. (But because
comparing text and binary files seems not very useful anyway,
it probably won't make much difference if conversions are made.)
One may want to have a flag for a truly binary diff.
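
As a rough sketch of the behaviour I am asking for (not how diff 2.8.7
is implemented), the following Python example applies the same CRLF to
LF conversion to both inputs before comparing them with the standard
difflib module. The function names and sample contents are assumptions
made up for illustration:

```python
import difflib


def normalized_lines(data: bytes) -> list[str]:
    """Apply the same CRLF-to-LF conversion to any input, then split lines."""
    text = data.replace(b"\r\n", b"\n").decode("latin-1")
    return text.splitlines(keepends=True)


def diff_same_conversion(a: bytes, b: bytes) -> list[str]:
    """Unified diff of two inputs that were both converted the same way."""
    return list(
        difflib.unified_diff(
            normalized_lines(a), normalized_lines(b), fromfile="a", tofile="b"
        )
    )


unix_file = b"first\nsecond\n"
windows_file = b"first\r\nsecond\r\n"
changed_file = b"first\r\nSECOND\r\n"

# Same text, different line endings: no differences are reported.
no_diff = diff_same_conversion(unix_file, windows_file)

# A real content change is still reported.
real_diff = diff_same_conversion(unix_file, changed_file)

print(no_diff)
print("".join(real_diff))
```

With both operands treated alike, only genuine content changes show up.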

Reading the User's Guide, the closest I can get to what you
said is this paragraph:
	The default Cygwin behavior:
	a.	If the file appears to reside on a file system that is 
		mounted (i.e. if its pathname starts with a directory 
		displayed by mount), then the default is specified by 
		the mount flag. If the file is a symbolic link, the mode 
		of the target file system applies.
Applying this to my system: I ran my test on my D: drive.
The only Cygwin mount on that drive is:
	d: on /cygdrive/d type system (textmode,noumount)
Therefore, in my opinion, according to the User's Guide all files
on my d: drive should have been opened by diff in text mode,
which, as we saw, is currently not the case.
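
My reading of rule (a) can be written down as a toy Python model. This
is only a sketch of how I interpret the quoted paragraph, not Cygwin's
actual path handling: the mount table contains just the one entry shown
above, default_mode is a made-up name, and the symbolic-link clause is
not modelled:

```python
from typing import Optional

# Hypothetical mount table modelled on the single entry quoted above.
MOUNTS = {"/cygdrive/d": "textmode"}


def default_mode(posix_path: str, mounts: dict[str, str] = MOUNTS) -> Optional[str]:
    """Return the flag of the longest mount prefix containing posix_path.

    Returns None when no mount in the table covers the path.
    """
    best = None
    for prefix in mounts:
        covers = posix_path == prefix or posix_path.startswith(
            prefix.rstrip("/") + "/"
        )
        if covers and (best is None or len(prefix) > len(best)):
            best = prefix
    return mounts[best] if best is not None else None


mode_on_d = default_mode("/cygdrive/d/test/a")
mode_on_c = default_mode("/cygdrive/c/test/a")

print(mode_on_d, mode_on_c)
```

Under this reading, every file below /cygdrive/d gets the textmode
default, whichever way its name was spelled on the command line.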

On Fri, 4 Mar 2005, Igor Pechtchanski wrote:
>> On Fri, 4 Mar 2005, Brian Dessent wrote:
>>
>> > I cannot reproduce this, either from a bash prompt or from cmd using
>> > your .bat file:
>>
>> I can reproduce this (even under bash).  All you need is a textmode mount
>> and files with CRLF line endings.
>
>Upon re-reading this, the above seems to imply that this is a bug.  I just
>want to clarify that, as explained below, this is NOT a bug, but intended
>behavior.  The "workaround" mentioned below is for a faulty setup
>(textmode mounts and '\'s in paths), not for any perceived bug.
>	Igor
>
>> > It's probably a textmode/binmode issue, though I don't know why
>> > switching between '\' and '/' as the path seperator changes it --
>> > although the Cygwin path handling code is complex and I can't pretend to
>> > understand it.
>>
>> Having a '\' in a filename bypasses Cygwin's mounts and uses regular
>> Windows mechanisms for opening the file.  Reading a file on a textmode
>> mount will translate CRLF line endings to normal LFs.  No wonder 'diff'
>> is confused.
>>
>> > There was no attached cygcheck so I don't know how your mounts are setup
>> > but from what I've read, using textmode mounts with tools like cvs and
>> > diff is a recipe for disaster.
>>
>> Yep.  One possible workaround is to use the '--strip-trailing-cr' option
>> to diff, which will make it insensitive to textmode/binmode line endings.
>


--
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
Problem reports:       http://cygwin.com/problems.html
Documentation:         http://cygwin.com/docs.html
FAQ:                   http://cygwin.com/faq/
