Mail Archives: cygwin/2003/05/18/18:26:02

Message-ID: <003901c31d8c$6ec495f0$78d96f83@pomello>
From: "Max Bowsher" <maxb AT ukf DOT net>
To: <martin AT xemacs DOT org>, <cygwin AT cygwin DOT com>
References: <16072 DOT 892 DOT 778395 DOT 24290 AT gargle DOT gargle DOT HOWL>
Subject: Re: SPARSE files considered harmful - please revert
Date: Sun, 18 May 2003 23:25:48 +0100

Martin Buchholz wrote:
> This patch is a bad idea.
>
> 2003-02-18  Vaclav Haisman  <V DOT Haisman AT sh DOT cvut DOT cz>
> * fhandler_disk_file.cc: Include winioctl.h for DeviceIoControl.
> (fhandler_disk_file::open): Set newly created and truncated files as
> sparse on platforms that support it.
>
> As someone on the mailing list asked, "If making every file sparse is
> such a good idea, why isn't it the default?".

Me, I think.

I agree that making *all* files sparse is a bad idea, but let me just say
that I consider the usage of the words "considered harmful" harmful to
effective discussion. It sets the scene for antagonistic flamewars. Ditto
"revert".

Actual test data, as you give below, is *good*, though:

> My experience has been that for me, sparse files take up much more
> disk space than non-sparse files, and are also significantly slower.
>
> I build software.  My build trees have 50000 files, average size 8k.
> When I copied build trees to a Win2000 NTFS disk using Cygwin tools
> (either cp or tar or rsync) the actual space used on the disk (as
> reported by df, not du) quintupled.
>
> Here's what I think is happening.  Sparse files are implemented like
> compressed files, using 16 clusters.  See this web page:
>
> http://www.storageadmin.com/Articles/Index.cfm?ArticleID=15900&pg=1&show=654
>
> As a result, a non-empty but small sparse file takes up a minimum of
> 16*clustersize bytes on the disk.  My measurements suggest an overhead
> of 32kb per file with a cluster size of 4kb.
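
As a rough check of the arithmetic in the quoted figures, here is a minimal Python sketch. It assumes every file occupies exactly its 8 KiB average when non-sparse and picks up the measured 32 KiB of overhead when sparse; the names are illustrative and are not taken from the original message.

```python
# Back-of-the-envelope check of the quoted disk-usage figures.
# Assumption: each file uses exactly 8 KiB when non-sparse, and the
# measured 32 KiB per-file overhead is simply added on top when sparse.
KIB = 1024

FILE_COUNT = 50_000          # files in the build tree (from the message)
AVG_FILE_SIZE = 8 * KIB      # average file size (from the message)
MEASURED_OVERHEAD = 32 * KIB # measured per-file overhead (from the message)
CLUSTER_SIZE = 4 * KIB       # NTFS cluster size (from the message)


def total_bytes(per_file_bytes: int, count: int = FILE_COUNT) -> int:
    """Total space used if every file occupies per_file_bytes."""
    return per_file_bytes * count


plain_total = total_bytes(AVG_FILE_SIZE)
sparse_total = total_bytes(AVG_FILE_SIZE + MEASURED_OVERHEAD)
growth_ratio = sparse_total / plain_total

# The "16 clusters" figure from the message, for comparison.
sixteen_clusters = 16 * CLUSTER_SIZE

print(f"non-sparse total: {plain_total} bytes")
print(f"sparse total:     {sparse_total} bytes")
print(f"growth ratio:     {growth_ratio}")
print(f"16 clusters:      {sixteen_clusters} bytes")
```

Under those assumptions the growth ratio comes out at 5, which matches the reported quintupling. Note that 16 clusters of 4 KiB is 64 KiB, not 32 KiB; the sketch only reports both numbers and does not try to reconcile them.
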
>
> Here are some experiments to support my results:
> MKS's commands create files 5 times smaller than Cygwin's commands.
...
> I'm sure if you do the experiments yourself, you will see this for
> yourself.  To reproduce this problem, you need NTFS 5.0 on Windows
> 2000.  Sparse files are a recent NTFS feature.
>
> The patch is obvious, but I'll send it to cygwin-patches anyways.
>
> Without this patch, Cygwin is unusable for me.

May I suggest a middle road? Why not let sparse files be configurable as a
$CYGWIN option? This would allow those users who actually want them to
enable them with minimal effort, but keep them off for most users.
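
To make the suggestion concrete, here is a minimal Python sketch of how such a setting might be read. It assumes $CYGWIN stays a space-separated list of tokens where a "no" prefix switches an option off; the token name "sparse" and the function `sparse_enabled` are hypothetical, and this is not Cygwin's actual option parser.

```python
import os


def sparse_enabled(cygwin_env: str, default: bool = False) -> bool:
    """Decide whether sparse files are requested in a $CYGWIN-style string.

    "sparse" and "nosparse" are hypothetical token names used only for
    illustration. The last matching token wins; other tokens are ignored.
    """
    enabled = default
    for token in cygwin_env.split():
        # Some options carry a value after a colon; only the name matters here.
        name = token.split(":", 1)[0].lower()
        if name == "sparse":
            enabled = True
        elif name == "nosparse":
            enabled = False
    return enabled


if __name__ == "__main__":
    # Off unless the user opts in, as proposed above.
    print(sparse_enabled(os.environ.get("CYGWIN", "")))
```

Defaulting to off means nobody gets sparse files unless they ask for them, e.g. by adding the (hypothetical) `sparse` token to $CYGWIN.
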

Max.


