Mail Archives: cygwin/2003/05/20/12:42:13

Message-ID: <015f01c31edf$3b235f20$6400a8c0@FoxtrotTech0001>
From: "Bill C. Riemers" <cygwin AT docbill DOT net>
To: <cygwin AT cygwin DOT com>
References: <16072 DOT 6666 DOT 10124 DOT 338022 AT gargle DOT gargle DOT HOWL> <00f301c31e12$c29efdb0$6400a8c0 AT FoxtrotTech0001> <00be01c31e15$944d0d50$78d96f83 AT pomello> <005601c31e26$77671260$6400a8c0 AT FoxtrotTech0001> <20030519175913 DOT GA24066 AT redhat DOT com> <008001c31e5e$39c0c680$6400a8c0 AT FoxtrotTech0001> <20030520024151 DOT GA1812 AT redhat DOT com>
Subject: Re: SPARSE files considered harmful - please revert
Date: Tue, 20 May 2003 10:50:56 -0400

> 1) You are assuming behavior that isn't documented.  I can imagine that
> the first block could occupy, say 16 blocks and depending on the size of
> the hole, there could be no fragmentation.

You are assuming an optimization that may or may not exist. In my example,
there is certainly no reason why the first block would occupy 16 blocks. I
already specified that the hole is exactly one block in size. At most, the
file system might allocate 3 blocks so that the middle one can be filled in
later. But even in that case you would still get fragmentation. It would
just be more likely to come from a one-block file being written into the
"reserved" space before that space is needed for the updated sparse file.
Either way, using a sparse file for a file that is regularly accessed in RW
mode will result in fragmentation. The only question is how fast it will
fragment. That depends on the filesystem and on how the drivers are
implemented.
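
To make the three-block example concrete, here is a toy model in Python.
It is only a sketch: ToyDisk and its first-fit allocate() are hypothetical
and do not describe how NTFS or any real driver places blocks. It assumes
no space is reserved for the hole, and that one unrelated one-block file
is written before the hole is filled.

```python
def count_extents(physical_blocks):
    """Count contiguous runs in a file's physical blocks, in logical order."""
    if not physical_blocks:
        return 0
    extents = 1
    for prev, cur in zip(physical_blocks, physical_blocks[1:]):
        if cur != prev + 1:
            extents += 1
    return extents


class ToyDisk:
    """Hypothetical first-fit block allocator (illustration only)."""

    def __init__(self, size):
        self.free = [True] * size

    def allocate(self):
        # Hand out the lowest-numbered free block.
        index = self.free.index(True)
        self.free[index] = False
        return index


def simulate(sparse):
    """Return the physical block of logical blocks 0, 1, 2 of the file."""
    disk = ToyDisk(16)
    file_map = {}                      # logical block -> physical block
    file_map[0] = disk.allocate()
    if not sparse:
        file_map[1] = disk.allocate()  # dense file: zero block written now
    file_map[2] = disk.allocate()
    disk.allocate()                    # an unrelated one-block file arrives
    if sparse:
        file_map[1] = disk.allocate()  # the hole is filled in later
    return [file_map[i] for i in sorted(file_map)]
```

In this model the dense file stays in one extent, while the sparse file
ends up in three once its hole is filled. A real filesystem may do better
or worse; the point is only that the late write has to land somewhere else.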

Really sophisticated drivers might even do things like rewrite the file if
it is below a threshold size, just to fix fragmentation on the fly. I can
definitely say NTFS is not that sophisticated. Even on disks with a large
amount of free space, NTFS fragments at an alarmingly fast rate. I
defragment a Linux partition once every few years at most (by
repartitioning and copying). Any more frequently and there is no
noticeable improvement in performance. For NTFS, I find I need to run the
defragmenter every weekend for optimal performance.

> 2) Normal read/write behavior would not result in a file that has a
> sparse block.  I think it is a rare program which writes beyond EOF.  So
> this would normally be a non-issue.

Correct. I am only talking about why it is a bad idea to blindly convert
all files to sparse files. This can be done with either GNU tar or GNU cp.
The fragmentation described above is going to happen, and does happen,
when the file in question is a database file, since databases tend to
contain lots of blank space intended for adding new records.
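
For anyone unsure what "blindly converting" means in practice, here is a
minimal Python sketch of the idea: copy a file, but seek over every
all-zero block instead of writing it. This is not the GNU cp or GNU tar
source. sparse_copy and the 4096-byte block size are assumptions made for
the illustration, and whether the skipped ranges really become holes
depends on the filesystem.

```python
import os

BLOCK_SIZE = 4096  # assumed block size, for illustration only


def sparse_copy(src, dst, block_size=BLOCK_SIZE):
    """Copy binary file object src to dst, skipping all-zero blocks.

    Returns the number of blocks that were skipped rather than written.
    dst must be a real, seekable file opened for binary writing.
    """
    skipped = 0
    while True:
        chunk = src.read(block_size)
        if not chunk:
            break
        if not any(chunk):
            # Block is entirely zero bytes: move forward without writing.
            dst.seek(len(chunk), os.SEEK_CUR)
            skipped += 1
        else:
            dst.write(chunk)
    # If the file ended with a skipped block, extend it to the full length.
    dst.truncate(dst.tell())
    return skipped
```

A database file with preallocated blank record space would have most of
that space skipped by a copy like this, which is exactly the case where
later record inserts have to be allocated somewhere else on the disk.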

> 3) What no one seems to be mentioning is that we are trying to emulate
> UNIX behavior here.  If the above is an issue for Windows then it could
> also be an issue for UNIX.

It sounds like we are really on the same page, but discussing different
issues. CYGWIN should definitely support creating sparse files in the
classical Unix method of seeking beyond the end of the file. From what
I've seen in this discussion it already does, and that is not an issue.
What I'm arguing is that files should not be blindly converted into sparse
files with GNU tar -S, GNU cp --sparse=always, etc.
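
For reference, the classical method mentioned above looks roughly like
this in Python. It is a sketch only: make_sparse_file and its arguments
are hypothetical names, and whether the skipped range is actually left
unallocated is up to the filesystem and driver, which is the very thing
under discussion in this thread.

```python
import os


def make_sparse_file(path, hole_bytes, head=b"start", tail=b"end"):
    """Write head, seek hole_bytes past the end of file, then write tail.

    Returns the os.stat() result for the new file.
    """
    with open(path, "wb") as f:
        f.write(head)
        f.seek(hole_bytes, os.SEEK_CUR)  # move past EOF without writing
        f.write(tail)
    return os.stat(path)
```

The logical size is always len(head) + hole_bytes + len(tail), and the
skipped range reads back as zero bytes. On POSIX systems, comparing
st_size with st_blocks * 512 from the returned stat result gives a rough
idea of whether a hole was really created; st_blocks is not available on
every platform.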

If, for example, you convert a database file into a sparse file, it is not
uncommon for the resulting fragmentation to slow database access by an
order of magnitude or more.

                                        Bill



