From: "Max Bowsher"
Subject: Re: SPARSE files considered harmful - please revert
Date: Sun, 18 May 2003 23:25:48 +0100

Martin Buchholz wrote:
> This patch is a bad idea.
>
> 2003-02-18  Vaclav Haisman
>     * fhandler_disk_file.cc: Include winioctl.h for DeviceIoControl.
>     (fhandler_disk_file::open): Set newly created and truncated files as
>     sparse on platforms that support it.
>
> As someone on the mailing list asked, "If making every file sparse is
> such a good idea, why isn't it the default?".

Me, I think.

I agree that making *all* files sparse is a bad idea, but let me just say that I consider the usage of the words "considered harmful" harmful to effective discussion. It sets the scene for antagonistic flamewars. Ditto "revert".

Actual test data, as you give below, is *good*, though:

> My experience has been that for me, sparse files take up much more
> disk space than non-sparse files, and are also significantly slower.
>
> I build software. My build trees have 50000 files, average size 8k.
> When I copied build trees to a Win2000 NTFS disk using Cygwin tools
> (either cp or tar or rsync) the actual space used on the disk (as
> reported by df, not du) quintupled.
>
> Here's what I think is happening. Sparse files are implemented like
> compressed files, using 16 clusters.
> See this web page:
>
> http://www.storageadmin.com/Articles/Index.cfm?ArticleID=15900&pg=1&show=654
>
> As a result, a non-empty but small sparse file takes up a minimum of
> 16*clustersize bytes on the disk. My measurements suggest an overhead
> of 32kb per file with a cluster size of 4kb.
>
> Here are some experiments to support my results:
> MKS's commands create files 5 times smaller than Cygwin commands.
...
> I'm sure if you do the experiments yourself, you will see this for
> yourself. To reproduce this problem, you need NTFS 5.0 on Windows
> 2000. Sparse files are a recent NTFS feature.
>
> The patch is obvious, but I'll send it to cygwin-patches anyways.
>
> Without this patch, Cygwin is unusable for me.

May I suggest a middle road? Why not let sparse files be configurable as a $CYGWIN option? This would allow those users who actually want them to enable them with minimal effort, but keep them off for most users.

Max.
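
For what it's worth, the figures quoted above can be sanity-checked with a little arithmetic. The sketch below is only a back-of-the-envelope estimate: it assumes every file pays the same flat 32kb overhead that Martin measured, and that the average file occupies exactly 8k on disk. The function name `estimated_usage` is made up for illustration and is not part of any Cygwin tool.

```python
# Rough check of the disk-usage figures quoted in this thread.
# Assumptions (not measurements): each file incurs one flat overhead,
# and the average file occupies exactly its nominal size on disk.

KIB = 1024


def estimated_usage(num_files, avg_size, overhead_per_file):
    """Return (plain_bytes, sparse_bytes, ratio) under the flat-overhead assumption."""
    plain = num_files * avg_size
    sparse = num_files * (avg_size + overhead_per_file)
    return plain, sparse, sparse / plain


if __name__ == "__main__":
    # 50000 files, 8k average, 32kb measured overhead per sparse file.
    plain, sparse, ratio = estimated_usage(50000, 8 * KIB, 32 * KIB)
    print(f"non-sparse: {plain} bytes")
    print(f"sparse:     {sparse} bytes")
    print(f"ratio:      {ratio:.1f}x")
```

Under those assumptions each 8k file ends up costing 40k, so the ratio comes out at 5, which is consistent with the "quintupled" observation.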