Mail Archives: cygwin/2002/07/25/03:45:42
The reason I got interested in this was Perl 5.8.0 breaking code
that worked in 5.6.1. The code compared the number of bytes in the
internal representation of an email message with the number
stored in the file.
Here is the result of my earlier script run on 5.6.1:
For underlying /binary/ mount mode
  Discipline: default   String length:  8   File size:  8
  Discipline: binary    String length:  8   File size:  8
  Discipline: text      String length: 10   File size: 10
For underlying /text/ mount mode
  Discipline: default   String length: 10   File size: 10
  Discipline: binary    String length:  8   File size:  8
  Discipline: text      String length: 10   File size: 10
Here is the same script on 5.8.0:
For underlying /binary/ mount mode
  Discipline: default   String length:  8   File size: 10
  Discipline: binary    String length:  8   File size:  8
  Discipline: text      String length:  8   File size: 10
For underlying /text/ mount mode
  Discipline: default   String length:  8   File size: 10
  Discipline: binary    String length:  8   File size:  8
  Discipline: text      String length:  8   File size: 10
If some of the values were 'wrong' under 5.6.1, at least they
were equal :-) With 5.8.0, it is finding the 'right' string
length in all cases, but now this value is only equal to the file
size when binmode() is used (i.e. writing a Unix-style file is
forced), even on an underlying binary mode mount.
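The kind of comparison that broke can be sketched roughly as
follows. This is only an illustration, not the original code: the
file name message.tmp and the 8-byte test string are made up, and
the string is compared as written rather than as read back.

```perl
use strict;
use warnings;

# Hypothetical scratch file and test data (not from the original code).
# The string is 8 bytes long when its line endings are plain "\n".
my $file    = "message.tmp";
my $message = "abc\nabc\n";

# Default open, no binmode(): whatever layers the platform applies
# by default decide what actually lands in the file.
open(my $out, ">", $file) or die "write $file: $!";
print $out $message;
close($out) or die "close $file: $!";

# Compare the in-memory length with the size on disk.
my $length = length($message);
my $size   = -s $file;
printf "String length: %d File size: %d (%s)\n",
    $length, $size, $length == $size ? "equal" : "differ";

unlink $file;
```

Adding binmode($out) right after the open should force the two
numbers to agree, as in the 'binary' rows above.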
It appears that the following, from perldoc perlcygwin, is no
longer an adequate account of what is happening:
o Text/Binary
When a file is opened it is in either text or binary
mode. In text mode a file is subject to CR/LF/Ctrl-Z
translations. With Cygwin, the default mode for an
open() is determined by the mode of the mount that
underlies the file. Perl provides a binmode() func-
tion to set binary mode on files that otherwise would
be treated as text. sysopen() with the "O_TEXT" flag
sets text mode on files that otherwise would be
treated as binary:
It appears that it is no longer just a choice between writing to
a binary mode mount or with binmode, as opposed to a text mode
mount or with O_TEXT.
According to perldoc perldelta:
o Previous versions of perl and some readings of some
sections of Camel III implied that ":raw" "discipline"
was the inverse of ":crlf". Turning off "clrfness"
is no longer enough to make a stream truly binary. So
the PerlIO ":raw" discipline is now formally defined
as being equivalent to binmode(FH) - which is in turn
defined as doing whatever is necessary to pass each
byte as-is without any translation. In particular
binmode(FH) - and hence ":raw" - will now turn off
both CRLF and UTF-8 translation and remove other
"layers" (e.g. :encoding()) which would modify byte
stream.
This seems to be a consequence of the new IO:
o IO is now by default done via PerlIO rather than sys-
tem's "stdio". PerlIO allows "layers" to be "pushed"
onto a file handle to alter the handle's behaviour.
Layers can be specified at open time via 3-arg form of
open:
open($fh,'>:crlf :utf8', $path) || ...
or on already opened handles via extended "binmode":
binmode($fh,':encoding(iso-8859-7)');
The built-in layers are: unix (low level read/write),
stdio (as in previous Perls), perlio (re-implementa-
tion of stdio buffering in a portable manner), crlf
(does CRLF <=> "\n" translation as on Win32, but
available on any platform). A mmap layer may be
available if platform supports it (mostly UNIXes).
Layers to be applied by default may be specified via
the 'open' pragma.
perldoc perlio says about defaults:
If the platform is MS-DOS like and normally does CRLF to
"\n" translation for text files then the default layers
are :
unix crlf
(The low level "unix" layer may be replaced by a platform
specific low level layer.)
Otherwise if "Configure" found out how to do "fast" IO
using system's stdio, then the default layers are :
unix stdio
Otherwise the default layers are
unix perlio
...
The default can be overridden by setting the environment
variable PERLIO to a space separated list of layers (unix
or platform low level layer is always pushed first).
...
cd .../perl/t
PERLIO=stdio ./perl harness
PERLIO=perlio ./perl harness
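One way to see which layers a handle actually ends up with is to
ask Perl itself. The sketch below assumes a Perl recent enough to
provide PerlIO::get_layers(), which may not include 5.8.0 itself.
The file name layers.tmp and the helper layers_for() are made up
for illustration.

```perl
use strict;
use warnings;

# Hypothetical scratch file, used only to have a handle to inspect.
my $file = "layers.tmp";

# Open $file with the given mode string and return the names of the
# layers on the resulting handle.
# Assumption: PerlIO::get_layers() is available in this Perl.
sub layers_for {
    my ($mode) = @_;
    open(my $fh, $mode, $file) or die "open $file ($mode): $!";
    my @layers = PerlIO::get_layers($fh);
    close($fh);
    unlink $file;
    return @layers;
}

# Default open, explicit :crlf, and explicit :raw.
for my $mode (">", ">:crlf", ">:raw") {
    my @layers = layers_for($mode);
    print "$mode: @layers\n";
}
```

Running it once as is and once with PERLIO set in the environment
should show whether the default really changes.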
So my earlier script may have been an adequate test bed for
5.6.1. The read on the file used a default open, and the string
read in seemed to reflect what had been written to the file. With
5.8.0 however, the read with a default open appears to be doing a
translation of CRLF to \n, because the platform is 'MS-DOS like'.
I need a script to test the effects of the various layers.
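A minimal sketch of such a script, assuming 5.8.0's 3-arg open with
layers. The file name layers.tmp, the helper try_layers() and the
8-byte test string are my own choices for illustration. It writes
the same string through each layer, reads it back through each
layer, and prints the same two numbers as the earlier script.

```perl
use strict;
use warnings;

# Hypothetical scratch file and test data: two lines, 8 bytes when
# the line endings are plain "\n", 10 bytes when they are CRLF.
my $file = "layers.tmp";
my $data = "abc\nabc\n";

# Write $data through $wlayer, read the file back through $rlayer,
# and return (length of the string read, size of the file on disk).
# An empty layer string means a default open.
sub try_layers {
    my ($wlayer, $rlayer) = @_;

    open(my $out, ">$wlayer", $file) or die "write $file: $!";
    print $out $data;
    close($out) or die "close $file: $!";

    my $size = -s $file;

    open(my $in, "<$rlayer", $file) or die "read $file: $!";
    my $string = do { local $/; <$in> };   # slurp the whole file
    close($in);

    unlink $file;
    return (length($string), $size);
}

for my $w ("", ":raw", ":crlf") {
    for my $r ("", ":raw", ":crlf") {
        my ($len, $size) = try_layers($w, $r);
        printf "Write: %-8s Read: %-8s String length: %2d File size: %2d\n",
            $w || "default", $r || "default", $len, $size;
    }
}
```

The rows with a ":raw" read show what is really in the file, since
that read does no translation. Running the whole thing on both a
binary and a text mount would show whether the mount mode still
matters.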
In any case, looking at the results of the earlier script for
5.8.0 at the top of the email and comparing them with those for
5.6.1, it also appears that default writes to a file, EVEN IF ON
AN UNDERLYING BINARY MOUNT, will now leave CRs in the file. This
is something that people won't be too happy about, I think.
--
Greg Matheson                The best jokes are
Chinmin College              those you play on
                             yourself.
Taiwan Penpals Archive <URL: http://netcity.hinet.net/kurage>