Mail Archives: cygwin/2003/05/18/18:06:05
This patch is a bad idea.
2003-02-18  Vaclav Haisman  <V DOT Haisman AT sh DOT cvut DOT cz>
	* fhandler_disk_file.cc: Include winioctl.h for DeviceIoControl.
	(fhandler_disk_file::open): Set newly created and truncated files as
	sparse on platforms that support it.
As someone on the mailing list asked, "If making every file sparse is
such a good idea, why isn't it the default?"
In my experience, sparse files take up much more disk space than
non-sparse files, and are also significantly slower.
I build software.  My build trees have 50000 files, average size 8k.
When I copied build trees to a Win2000 NTFS disk using Cygwin tools
(either cp or tar or rsync) the actual space used on the disk (as
reported by df, not du) quintupled.
Here's what I think is happening.  Sparse files are implemented like
compressed files, allocated in units of 16 clusters.  See this web page:
http://www.storageadmin.com/Articles/Index.cfm?ArticleID=15900&pg=1&show=654
As a result, a non-empty but small sparse file takes up a minimum of
16*clustersize bytes on the disk.  My measurements suggest an overhead
of 32 KB per file with a cluster size of 4 KB.
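
To make the arithmetic concrete, here is a small shell sketch of the
allocation model described above.  The function names are made up for
illustration, and the 16-cluster minimum is the hypothesis from this
message, not a documented guarantee.

```shell
# Hypothetical helpers; the names are illustrative, not part of Cygwin.

# Minimum allocation (KB) for a small sparse file under the
# 16-cluster hypothesis, given the cluster size in KB.
min_sparse_alloc_kb() {
    echo $(( 16 * $1 ))
}

# Total overhead (MB) for a tree of N files at a given per-file
# overhead in KB (integer division, rounds down).
tree_overhead_mb() {
    echo $(( $1 * $2 / 1024 ))
}

min_sparse_alloc_kb 4        # 4 KB clusters -> 64
tree_overhead_mb 50000 32    # 50000 files at the measured 32 KB -> 1562
```

The measured 32 KB is lower than the 64 KB this model gives for 4 KB
clusters, so treat the model as a rough guide only.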
Here are some experiments to support my results:
MKS's commands create files 5 times smaller than Cygwin's commands do.
----------------------------------------------------------------
In 1.3.22:
cpdir is a trivial script that basically does
(cd $dir1; tar cf - .) | (cd $dir2; tar xf -)
`cp -pr' works the same way.
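
The cpdir script itself is not shown in this message.  A minimal sketch
of what such a script might look like, assuming it takes a source and a
destination directory and echoes the mkdir step (as the "==> mkdir -p"
lines in the transcripts suggest):

```shell
# Hypothetical reconstruction of cpdir; the real script is not shown.
# Usage: cpdir SRC_DIR DST_DIR
cpdir() {
    src=$1
    dst=$2
    echo "==> mkdir -p $dst"
    mkdir -p "$dst" || return 1
    # Stream the tree through tar so every file is recreated
    # by the tools under test.
    (cd "$src" && tar cf - .) | (cd "$dst" && tar xf -)
}
```

Using `&&' instead of `;' after each cd keeps tar from running in the
wrong directory if the cd fails.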
# Use Cygwin commands to create a huge file tree
#
$ df .; cpdir dev2 copy-of-dev2; df .
Filesystem    Type   1M-blocks      Used Available Use% Mounted on
d:          system       11492      6001      5491  53% /d
==> mkdir -p copy-of-dev2
cpdir dev2 copy-of-dev2  17.46s user 53.72s system 18% cpu 6:33.99 total
Filesystem    Type   1M-blocks      Used Available Use% Mounted on
d:          system       11492      8438      3054  74% /d
$ du -sm dev2 copy-of-dev2
419	dev2
419	copy-of-dev2
du -h -sm dev2 copy-of-dev2  5.64s user 16.36s system 76% cpu 28.784 total
----------------------------------------------------------------
After reverting to 1.3.20, or patching latest CVS:
I used this method to reclaim disk space that was eaten up by the
SPARSE file disk hog.
$ df .; mv ws ws-old; cpdir ws-old ws; df .
Filesystem    Type   1M-blocks      Used Available Use% Mounted on
d:          system       11492      6910      4582  61% /d
==> mkdir -p ws
cpdir ws-old ws  58.68s user 225.50s system 19% cpu 23:44.30 total
Filesystem    Type   1M-blocks      Used Available Use% Mounted on
d:          system       11492      9085      2407  80% /d
$ df .; rm -rf ws-old; df .
Filesystem    Type   1M-blocks      Used Available Use% Mounted on
d:          system       11492      9085      2407  80% /d
rm -rf ws-old  21.86s user 71.33s system 38% cpu 4:01.85 total
Filesystem    Type   1M-blocks      Used Available Use% Mounted on
d:          system       11492      3689      7803  33% /d
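
The df figures in the two transcripts can be reduced to a few numbers
with plain shell arithmetic.  This is only a sketch: the helper names
are made up, and reading ws-old as the sparse tree and ws as the
non-sparse copy is an interpretation of the transcript above.

```shell
# Hypothetical helpers for working with the df "Used" column (MB).

# Difference between two df "Used" readings.
used_delta() {
    echo $(( $2 - $1 ))
}

# Integer ratio scaled by 10 (58 means 5.8x); POSIX sh arithmetic
# has no floating point.
ratio_x10() {
    echo $(( $1 * 10 / $2 ))
}

grew=$(used_delta 6001 8438)    # 1.3.22 copy of dev2: 2437 MB
echo "1.3.22 copy: $grew MB on disk for 419 MB of data, x10 ratio $(ratio_x10 "$grew" 419)"

added=$(used_delta 6910 9085)   # copy made after reverting: 2175 MB
freed=$(used_delta 3689 9085)   # released by rm -rf ws-old: 5396 MB
echo "ws-old held $freed MB, new ws holds $added MB, x10 ratio $(ratio_x10 "$freed" "$added")"
```

df reports usage for the whole volume, so these figures are only
meaningful if nothing else was writing to the disk during the runs.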
----------------------------------------------------------------
If you repeat these experiments, you should see the same results.  To
reproduce the problem, you need NTFS 5.0 on Windows 2000; sparse
files are a recent NTFS feature.
The patch is obvious, but I'll send it to cygwin-patches anyways.
Without this patch, Cygwin is unusable for me.
Martin