Mail Archives: cygwin/2006/08/02/18:33:19

X-Spam-Check-By: sourceware.org
X-BigFish: V
From: Vladimir Dergachev <vdergachev AT rcgardis DOT com>
To: cygwin AT cygwin DOT com
Subject: NTFS fragmentation
Date: Wed, 2 Aug 2006 18:33:04 -0400
User-Agent: KMail/1.9.3
MIME-Version: 1.0
Message-Id: <200608021833.04775.vdergachev@rcgardis.com>
Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm
List-Subscribe: <mailto:cygwin-subscribe AT cygwin DOT com>
List-Archive: <http://sourceware.org/ml/cygwin/>
List-Post: <mailto:cygwin AT cygwin DOT com>
List-Help: <mailto:cygwin-help AT cygwin DOT com>, <http://sourceware.org/ml/#faqs>
Sender: cygwin-owner AT cygwin DOT com
Mail-Followup-To: cygwin AT cygwin DOT com
Delivered-To: mailing list cygwin AT cygwin DOT com

--Boundary-00=_ggS0Eg1ePLepXPE
Content-Type: text/plain;
  charset="us-ascii"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline


Hi all, 

       I have encountered some rather puzzling fragmentation that occurs when 
writing files using Cygwin. 

       What happens is that if one creates a new file and writes data to it
(whether via a command line redirect or with a Tcl script - have not tried C 
yet) the file ends up heavily fragmented. 

       In contrast, native Windows utilities do not exhibit this issue.

       Someone suggested to me that Windows requires an expected file length 
to be passed at the time of open, so I searched on Google and found the 
"fsutil" program, which can reserve space on the filesystem.

       I attached a small Tcl script that, when run, creates two 30 MB files - 
one using a regular open/write pair (which ends up fragmented into about 300 
pieces on my system) and one that is first preallocated with fsutil, then 
opened in append mode and rewound with a seek to offset 0.
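
       For comparison, here is a minimal sketch of reserving the space from 
Tcl itself instead of calling fsutil. It is not part of the attached script: 
the proc names preallocate and fill_preallocated are hypothetical, and it 
rests on the assumption that writing a single byte at the last offset makes 
the filesystem allocate the whole length up front. Whether that actually 
avoids the fragmentation described here is untested.

```tcl
# Hypothetical helpers, NOT part of the attached ntfs_test.tcl.
# Assumption: writing one byte at offset size-1 makes the filesystem
# allocate the full length of the file in one go.

proc preallocate { filename size } {
    # "w" creates the file, or truncates it if it already exists.
    set F [open $filename "w"]
    fconfigure $F -translation binary
    if { $size > 0 } {
        # Jump to the last byte of the intended file and write it.
        seek $F [expr {$size - 1}] start
        puts -nonewline $F "\0"
    }
    close $F
    return [file size $filename]
}

proc fill_preallocated { filename size chunksize } {
    # "r+" opens the existing file for reading and writing
    # without truncating it, so the reserved length is kept.
    set F [open $filename "r+"]
    fconfigure $F -translation binary -buffering full -buffersize $chunksize
    set remaining $size
    while { $remaining > 0 } {
        # Write at most one chunk per iteration.
        set n [expr {$remaining < $chunksize ? $remaining : $chunksize}]
        puts -nonewline $F [string repeat "X" $n]
        incr remaining -$n
    }
    close $F
    return [file size $filename]
}
```

       With the sizes from the attached script this would be called as 
"preallocate b.dat 30000000" followed by 
"fill_preallocated b.dat 30000000 1000000".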

      To see the problem, defragment your system, run the test script, then 
run analyze and ask to view the report. You will see a.dat at the top of the 
list, while b.dat never appears in the report. 

       Despite the workaround, it is still kinda hard for me to believe that 
anyone has designed a filesystem that needs to know what the file size is 
going to be - especially for a single program writing on an almost empty 
disk. Perhaps there is some sort of environment variable that I need to set?

        Any suggestions and comments would be greatly appreciated.        
Please CC me - I am not on the list.

                           thank you very much

                                        Vladimir Dergachev


--Boundary-00=_ggS0Eg1ePLepXPE
Content-Type: text/x-tcl;
  charset="us-ascii";
  name="ntfs_test.tcl"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment;
	filename="ntfs_test.tcl"

#!/usr/bin/env tclsh

foreach {var value} {
	FRAGMENTED_FILE  "a.dat"
	CONTIGUOUS_FILE  "b.dat"
	FILE_SIZE	30000000
	BUFFER_SIZE	1000000
	} {
	global $var
	set $var $value
	}

proc fragmented_write { filename size buffersize } {
puts stderr "DELETING $filename"
file delete $filename
puts stderr "fragmented_write $filename"
set F [open $filename "w"]
fconfigure $F -buffering full -buffersize $buffersize

for { set i 0 } { $i < $size } { incr i } {
	puts -nonewline $F "X"
	}
puts stderr "closing $filename"
close $F
}

proc contiguous_write { filename size buffersize } {
puts stderr "DELETING $filename"
file delete $filename
puts stderr "preallocating $filename"
exec fsutil file createnew [file nativename $filename] $size
puts stderr "contiguous_write $filename"
set F [open $filename "a"]
fconfigure $F -buffering full -buffersize $buffersize
seek $F 0 start

for { set i 0 } { $i < $size } { incr i } {
	puts -nonewline $F "X"
	}
puts stderr "closing $filename"
close $F
}

fragmented_write $FRAGMENTED_FILE $FILE_SIZE $BUFFER_SIZE

contiguous_write $CONTIGUOUS_FILE $FILE_SIZE $BUFFER_SIZE


--Boundary-00=_ggS0Eg1ePLepXPE
Content-Type: text/plain; charset=us-ascii

--
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
Problem reports:       http://cygwin.com/problems.html
Documentation:         http://cygwin.com/docs.html
FAQ:                   http://cygwin.com/faq/
--Boundary-00=_ggS0Eg1ePLepXPE--
