Mail Archives: cygwin/2006/08/05/17:05:32

X-Spam-Check-By: sourceware.org
Message-ID: <20060805210507.4049.qmail@web54206.mail.yahoo.com>
Date: Sat, 5 Aug 2006 14:05:07 -0700 (PDT)
From: Jim Lawson <jrlaim AT yahoo DOT com>
Subject: Re: NTFS fragmentation
To: cygwin AT cygwin DOT com
In-Reply-To: <1154685472.16415.ezmlm@cygwin.com>
MIME-Version: 1.0
Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm
List-Unsubscribe: <mailto:cygwin-unsubscribe-archive-cygwin=delorie DOT com AT cygwin DOT com>
List-Subscribe: <mailto:cygwin-subscribe AT cygwin DOT com>
List-Archive: <http://sourceware.org/ml/cygwin/>
List-Post: <mailto:cygwin AT cygwin DOT com>
List-Help: <mailto:cygwin-help AT cygwin DOT com>, <http://sourceware.org/ml/#faqs>
Sender: cygwin-owner AT cygwin DOT com
Mail-Followup-To: cygwin AT cygwin DOT com
Delivered-To: mailing list cygwin AT cygwin DOT com

> From: Vladimir Dergachev <vdergachev AT rcgardis DOT com>
> To: cygwin AT cygwin DOT com
> Subject: Re: NTFS fragmentation
> Date: Thu, 3 Aug 2006 14:54:33 -0400
>
> On Thursday 03 August 2006 2:37 pm, Dave Korn wrote:
> > On 03 August 2006 18:50, Vladimir Dergachev wrote:
> > > On Thursday 03 August 2006 5:18 am, Dave Korn wrote:
> > >> On 03 August 2006 00:46, Vladimir Dergachev wrote:
> > >>
> > >>
> > >>     Hi Vladimir,
> > >>
> > >>>>> Please CC me - I am not on the list.
> > >>
> > >>   Done :)
> > >>
> > >
> > > I guess this means that sequential writes are officially broken on NTFS.
> > >
> > > Anyone has any idea for a workaround ? It would be nice if a simple
> > > tar zcvf a.tgz * does not result in a completely fragmented file.
> >
> >   I can only think of one thing worth trying off the top of my head: what
> > happens if you open a file (in non-sparse mode) and immediately seek to the
> > file size, then seek back to the start and actually write the contents?  Or
> > perhaps after seeking to the end you'd need to write (at least) a single
> > byte, then seek back to the beginning?
> >
>
> I am not sure that I understand, if one creates the file and then seeks to
> +1G, wouldn't the file pointer be still at 0 as the filesize is 0 ?
>
> What I am thinking about is modifying cygwin's open and write calls so that
> they preallocate files in chunks of 10MB (configurable by an environment
> variable).
>
> This way we still get some fragmentation, but it would not be so bad -
> assuming 50MB/sec disk read speed reading 10MB will take 200ms, while a seek
> is at worst 20ms (usually around 10-15ms).
>
>                                      best
>
>                                             Vladimir Dergachev
>
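
Vladimir's estimate above is easy to check. The following is only
a rough sketch of the arithmetic: it assumes one seek per
contiguous chunk and a constant transfer rate, and the function
name seek_overhead is made up for illustration.

```c
#include <assert.h>

/* Fraction of total read time spent seeking when a file is laid
   out in contiguous chunks and each chunk costs exactly one seek.
   chunk_mb:      size of one contiguous chunk, in MB
   read_mb_per_s: sequential transfer rate, in MB/s
   seek_ms:       cost of one seek, in milliseconds */
double seek_overhead(double chunk_mb, double read_mb_per_s, double seek_ms)
{
    double transfer_ms;

    assert(chunk_mb > 0 && read_mb_per_s > 0 && seek_ms >= 0);

    /* Time to stream one chunk once the head is in position. */
    transfer_ms = chunk_mb / read_mb_per_s * 1000.0;

    return seek_ms / (seek_ms + transfer_ms);
}
```

With his numbers (10MB chunks, 50MB/sec, 20ms worst-case seek) the
transfer takes 200ms per chunk and the result is about 0.09, i.e.
roughly 9% of the read time goes to seeking.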
 
It turns out that to actually allocate the file
blocks, you need to write some data. Seeking to the
desired size doesn't (or didn't use to) allocate the
intervening blocks. As Dave suggests, you need to seek
to the end and write something there to get the blocks
allocated. If you try this for a very large file
(several gigabytes), be prepared to go and have a nice
meal while you wait for the allocation to complete.
Windows' security policy requires that the blocks not
only be allocated but also be written with data,
ostensibly to prevent malicious code from reading old
data it shouldn't have access to.
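
For reference, Dave's seek-and-write idea might look roughly like
this using the POSIX calls that Cygwin provides. It is a sketch,
not a tested fix: the name open_with_size is made up, and whether
NTFS then lays the file out contiguously is exactly the open
question in this thread. On filesystems with implicit sparse files
(most Unix ones) the same code just creates a hole.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Create (or truncate) `path`, force its length to `size` by writing
   a single zero byte at offset size-1, then rewind to offset 0.
   Returns an open write-only descriptor, or -1 on failure. */
int open_with_size(const char *path, off_t size)
{
    int fd;

    assert(path != NULL && size > 0);

    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    if (lseek(fd, size - 1, SEEK_SET) == (off_t)-1  /* seek to the last byte */
        || write(fd, "", 1) != 1                    /* write one NUL byte    */
        || lseek(fd, 0, SEEK_SET) == (off_t)-1) {   /* back to the start     */
        close(fd);
        return -1;
    }
    return fd;
}
```

The caller then writes the real contents from offset 0 as usual and
truncates the file afterwards if it ends up shorter than `size`.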

Granted, there are better ways to do this: zero-fill
on attempts to read from allocated but uninitialized
file space or, at the very least, throw some kind of
exception when an application attempts to read
uninitialized file data. Since Windows supports sparse
files, the basic mechanism is there somewhere.

Windows doesn't (or didn't use to) allow preallocation
of files without actually writing data UNLESS you know
the proper incantation to prove you're a good guy
(your application needs to do a dance to enable the
"SeManageVolumePrivilege" privilege in its token so it
can issue the "SetFileValidData" call).


