Mail Archives: cygwin/2006/11/20/12:58:35

X-Spam-Check-By: sourceware.org
X-BigFish: V
From: Vladimir Dergachev <vdergachev AT rcgardis DOT com>
To: Linda Walsh <cygwin AT tlinx DOT org>
Subject: Re: NTFS fragmentation redux
Date: Mon, 20 Nov 2006 12:52:31 -0500
User-Agent: KMail/1.9.5
Cc: cygwin AT cygwin DOT com, dave DOT korn AT artimi DOT com
References: <456133E5 DOT 8000509 AT tlinx DOT org>
In-Reply-To: <456133E5.8000509@tlinx.org>
MIME-Version: 1.0
Message-Id: <200611201252.31836.vdergachev@rcgardis.com>
Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm
List-Subscribe: <mailto:cygwin-subscribe AT cygwin DOT com>
List-Archive: <http://sourceware.org/ml/cygwin/>
List-Post: <mailto:cygwin AT cygwin DOT com>
List-Help: <mailto:cygwin-help AT cygwin DOT com>, <http://sourceware.org/ml/#faqs>
Sender: cygwin-owner AT cygwin DOT com
Mail-Followup-To: cygwin AT cygwin DOT com
Delivered-To: mailing list cygwin AT cygwin DOT com
X-MIME-Autoconverted: from quoted-printable to 8bit by delorie.com id kAKHwUhS002132

On Sunday 19 November 2006 11:49 pm, Linda Walsh wrote:
> Some time back (~Aug), there was a discussion about NTFS's file
> fragmentation problem.
>
> Some notes at the time:
>
> From:  Vladimir Dergachev
>
> >        I have encountered a rather puzzling fragmentation
> > that occurs when writing files using Cygwin.
>
> ...
>
> >        a small Tcl script that, when run, creates
> > files fragmented into about 300 pieces on my system)
>
>         &&
>
> On 03 August 2006 18:50, Vladimir Dergachev wrote:
> > I guess this means that sequential writes are officially broken on NTFS.
> > Does anyone have any idea for a workaround? It would be nice if a simple
> > tar zcvf a.tgz * did not result in a completely fragmented file.
>
> 	&&
>
> On Aug  3 14:54, Vladimir Dergachev wrote:
> > What I am thinking about is modifying cygwin's open and write calls so
> > that they preallocate files in chunks of 10MB (configurable by an
> > environment variable).
>
> ------------
>
> The "fault" is the behavior of the file system.
> I compared NTFS with ext3 & xfs on linux (jfs & reiser hide how many
> fragments a file is divided into).
>
> NTFS is in the middle as far as fragmentation performance goes.  My disk
> is usually defragmented, but the built-in Windows defragmenter doesn't
> defragment free space.
>
> I used a file size of 64M and proceeded to copy that file to
> a destination file using various utilities.
>
> With XFS (Linux), I wasn't able to fragment the target file.  Even
> when writing 1K chunks in append mode, the target file always ended up
> as a single 64M fragment.
>
> With ext3 (also Linux), the copy method didn't seem to matter:
> cp, dd (block size 64M), and rsync all produced a target file with
> 2473 fragments.

This is curious - how do you find out the fragmentation of an ext3 file? I do
not know of a utility that tells me that.

From indirect observation, ext3 does not get fragmentation nearly that bad
until the filesystem is close to full, or I would not be able to reach
sequential read speeds (the all-seeks speed is about 6 MB/sec for me, and I
was getting 40-50 MB/sec). This was on much larger files, though.
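
(Roughly what I mean by checking sequential read speed - a quick Python
sketch rather than the exact commands I used, and the function name is mine;
the file should be bigger than RAM so the page cache does not hide the seeks:)

    import sys, time

    def read_speed(path, bufsize=8 * 1024 * 1024):
        """Time a plain sequential read and return throughput in MB/sec."""
        total = 0
        start = time.time()
        with open(path, "rb") as f:
            while True:
                chunk = f.read(bufsize)
                if not chunk:
                    break
                total += len(chunk)
        return total / (1024.0 * 1024.0) / (time.time() - start)

    if __name__ == "__main__":
        print("%.1f MB/sec" % read_speed(sys.argv[1]))

A badly fragmented file shows up as throughput much closer to the all-seeks
figure than to the drive's sequential rate.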

Which journal option was the filesystem mounted with?

>
> NTFS under cygwin varies the number of fragments based on the tool
> writing the output.
> "cp" produced the most fragments, at 515.
> "rsync" came next with 19 fragments.
> "dd" (using bs=32M or bs=64M) did best, at 1 fragment.
> Using "dd" with a block size of 8k produced the same
> results as "cp".
>
> It appears cygwin does exactly the right thing as far as file
> writes are concerned -- it writes the output using the block size
> specified by the client program you are running.  If you use a
> small block size, NTFS allocates space for each write that you do.
> If you use a big block size, NTFS appears to look for the first
> place that the entire write will fit.  Back in DOS days, the
> built-in COPY command buffered as much data as would fit in
> memory and then wrote it out -- meaning it was likely to create
> the output with a minimal number of fragments.
>
> If you want your files to be unfragmented, you need to use a
> file copy (or file write) utility that uses a large buffer size --
> one that, if possible, writes the entire file in a single write.
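
As an aside, the "one big write" approach is easy to script; here is a
minimal sketch in Python (my choice of tool here, and the helper name is
mine - the advice above does not assume any particular utility):

    import sys

    def copy_one_write(src, dst):
        """Read the whole source into memory and write it out in a single
        call, giving NTFS a chance to find one contiguous run of free space."""
        with open(src, "rb") as fin:
            data = fin.read()
        with open(dst, "wb") as fout:
            fout.write(data)

    if __name__ == "__main__":
        copy_one_write(sys.argv[1], sys.argv[2])

(Obviously this only works when the file fits in memory.)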

I actually implemented a workaround that calls "fsutil file createnew
FILESIZE" to preallocate space and then writes the data in append mode
(after doing a seek to 0).
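
Roughly, the idea is the following (a sketch in Python rather than the
actual code; the open flags and the helper name are my assumptions, not a
quote from the patch):

    import subprocess

    def preallocated_write(path, data):
        # "fsutil file createnew <file> <length>" preallocates the file at
        # its final size; it fails if the file already exists.
        subprocess.check_call(
            ["fsutil", "file", "createnew", path, str(len(data))])
        # Reopen without truncating and overwrite from offset 0, so the
        # space reserved by fsutil is reused rather than reallocated.
        with open(path, "r+b") as f:
            f.seek(0)
            f.write(data)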

                thank you!

                        Vladimir Dergachev

>
> In the "tar zcvf a.tgz *" case, I'd suggest piping the output of
> tar into "dd" and using a large blocksize.
>
> Linda




