Mail Archives: cygwin/2003/10/15/22:32:26

Date: Wed, 15 Oct 2003 19:31:15 -0700
From: jw schultz <jw AT pegasys DOT ws>
To: rsync list <rsync AT samba DOT org>, cygwin AT cygwin DOT com
Subject: [alert] DST change and date comparisons
Message-ID: <20031016023115.GB26304@pegasys.ws>
Mail-Followup-To: jw schultz <jw AT pegasys DOT ws>,
rsync list <rsync AT samba DOT org>, cygwin AT cygwin DOT com
Mime-Version: 1.0
User-Agent: Mutt/1.3.27i
X-Message-Flag: Unauthorised duplication and storage of this email is a violation of international copyright law and is subject to prosecution.

		Description of Problem

We are rapidly approaching the time of year when some will
transition from standard time (ST) to daylight saving time
(DST) and others will make the opposite transition.  These
vernal and autumnal transitions have important implications
for those who run Microsoft systems and use utilities that
compare file timestamps across different filesystem types or
with filesystems on other operating systems.

The problem lies in the way FAT filesystems store
timestamps and how Windows converts between local time and
UTC.

In UNIX and UNIX-like systems such as Linux, file timestamps
are stored in UTC (universal time) and are only converted to
local-time by user-space programs for display purposes.  At
the system call level all time values are in UTC and
utilities that compare timestamps do so in UTC.  Also, the
standard UTC->local and local->UTC conversion functions are
aware of DST and conversions reflect this so that if a
timestamp was recorded during ST it will be converted using
the ST offset even when the current system time is DST.

In Windows things are not so simple.  Windows operates in
local-time.  Timestamps in the various FAT-derived
filesystems are stored in local-time.  Timestamps in NTFS
filesystems are stored in UTC.  This inconsistency is
further complicated by the fact that the conversion routines
used are not DST aware.  Instead, the system applies a fixed
offset to convert between local-time and UTC regardless of
the date in the timestamp.  This fixed offset is calculated
at boot time and only changed when the system transitions to
or from DST.  As a result, at each transition the apparent
modification time of every file shifts by one hour: files on
NTFS shift when a Windows utility reports them in local-time,
and files on FAT-based filesystems shift when they are
reported in UTC.
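
To make the arithmetic concrete, here is a small Python sketch of
the fixed-offset behaviour described above.  It is only a model of
the behaviour as reported, not Windows code.  The offsets (UTC-8
for ST, UTC-7 for DST) and the function names fat_as_utc and
ntfs_as_local are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta

# Assumed offsets for illustration only (US Pacific): UTC-8 in ST, UTC-7 in DST.
ST_OFFSET = timedelta(hours=-8)
DST_OFFSET = timedelta(hours=-7)

def fat_as_utc(stored_local, current_offset):
    # FAT keeps local wall-clock time; UTC is derived from the offset in force now.
    return stored_local - current_offset

def ntfs_as_local(stored_utc, current_offset):
    # NTFS keeps UTC; local time is derived from the offset in force now.
    return stored_utc + current_offset

# One file saved at 12:00 local time on 15 January (standard time).
fat_stamp = datetime(2003, 1, 15, 12, 0)   # local time, as kept on FAT
ntfs_stamp = datetime(2003, 1, 15, 20, 0)  # same instant in UTC, as kept on NTFS

# How far each reported time moves once the system switches to the DST offset.
fat_shift = fat_as_utc(fat_stamp, DST_OFFSET) - fat_as_utc(fat_stamp, ST_OFFSET)
ntfs_shift = ntfs_as_local(ntfs_stamp, DST_OFFSET) - ntfs_as_local(ntfs_stamp, ST_OFFSET)

print("FAT file, reported in UTC, shifts by", fat_shift.total_seconds(), "seconds")
print("NTFS file, reported in local time, shifts by", ntfs_shift.total_seconds(), "seconds")
```

In this model the file itself never changes, yet both reported
times move by an hour (in opposite directions), which is enough
for a timestamp comparison to flag the file as modified.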

The difficulty that this produces is that any utilities that
compare timestamps between FAT and NTFS filesystems or
between Windows and other platforms will view files that
have not changed as being different.  Among other things
this will affect rsync, rdiff, unison, wget and make.

With the reduced cost of hard disks many newer backup
systems are using hard disk based storage and take advantage
of timestamp comparison to detect file changes for the sake
of efficiency.  Rsync is probably premier in this role and
is used by a fair number of free and even commercial backup
systems as well as being the basis for many home-brew backup
solutions.

With rsync and similar systems the effect of this is that
every file will appear to have been changed.  The result is
that any space savings associated with linking (--link-dest)
or with decremental backup approaches (--compare-dest and
--backup-dir) will be defeated.  Perhaps worse, because
every file will appear to have changed, the time required to
do a backup or a non-backup rsync will be much longer than
normal.  In some cases backups that normally complete in
less than one hour can take several days.

So what can be done about it?  Several things: there are
ways to merely mitigate the problem, to correct it, and
finally to prevent it entirely.

			Mitigation

Rsync has a --modify-window option.  Many of you already use
--modify-window=1 to cope with the fact that Windows often
stores timestamps with a two-second resolution.  Using
--modify-window=3601 will cause rsync to ignore timestamp
differences of up to one hour.

This isn't particularly dangerous since a file would have to
be changed, synced and changed again without changing size
within a single hour and have no subsequent changes for this
to miss a file change.  For this to happen during the hours
most backups occur is improbable.

It may be advisable to activate this mode prior to the time
change in order to delay the issue until you know exactly
how you are affected.
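
As a rough illustration of why the larger window hides the DST
shift, here is a simplified Python model of a timestamp comparison
with a tolerance.  It is not rsync's source code; the function
name same_mtime and the sample mtime are made up for this sketch.

```python
def same_mtime(src_mtime, dest_mtime, modify_window=0):
    # Treat two mtimes (seconds since the epoch) as equal when they
    # differ by no more than modify_window seconds.
    return abs(src_mtime - dest_mtime) <= modify_window

base = 1_050_000_000   # arbitrary example mtime, seconds since the epoch

# A one-hour DST shift looks like a change with no window or a small one...
print(same_mtime(base, base + 3600))
print(same_mtime(base, base + 3600, modify_window=1))

# ...but is ignored with a window of 3601 seconds.
print(same_mtime(base, base + 3600, modify_window=3601))

# A difference of two hours is still treated as a change.
print(same_mtime(base, base + 7200, modify_window=3601))
```

The trade-off is the one noted above: any genuine change that
leaves the size alone and lands within the window of the
previously synced timestamp would also be ignored.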

			Correction

There are two ways to correct the problem.

If you are running in mitigation mode you can do an rsync
with a normal modify-window on one subset of your data at a
time.  Once all of a given directory tree has been corrected
you can return to the normal modify-window for that tree.

The other way to correct things is to change the timestamps
on the files modified before the change.  This can be done
afterwards or, if you know in advance exactly what will
happen it may be done in advance. 

I include here an example Perl script that will change the
timestamps of files in a list on standard-input.  Whether
you use a positive or negative shift will depend on which
end you decide to adjust.

This is an example of how to use the script:
	touch -d '01:00 13-apr-03' /tmp/cmpfile
	find . -type f ! -newer /tmp/cmpfile | shifttime.pl 3600

------ shifttime.pl -------
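#!/usr/bin/perl
# Shift the access and modification times of the files named on
# standard input (one path per line) by the given number of seconds.
use strict;
use warnings;

# Offset in seconds; may be negative but must not be zero.
my $offset = shift;
defined $offset && $offset =~ /^[-+]?\d+$/ && $offset != 0
	or die "usage: $0 offset-in-seconds\n";

while (my $file = <STDIN>)
{
	chomp $file;
	-w $file or next;		# skip missing or unwritable files
	my $oldtime = (stat $file)[9];	# current modification time
	$oldtime or next;
	my $newtime = $oldtime + $offset;
	# Set both atime and mtime to the shifted value.
	utime($newtime, $newtime, $file)
		or warn "cannot set time on $file: $!\n";
}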
------ end -------------------
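
Because the script changes timestamps in place, it may be worth
previewing what it would do first.  The following Python sketch
only computes the old and new modification times and modifies
nothing.  The function name plan_shift is hypothetical, and the
temporary file exists only to make the example self-contained.

```python
import os
import tempfile

def plan_shift(paths, offset):
    # Return (path, old_mtime, new_mtime) for each writable file.
    # Nothing on disk is modified.
    plan = []
    for path in paths:
        if not os.access(path, os.W_OK):
            continue                    # same idea as the script's -w test
        old = int(os.stat(path).st_mtime)
        if old == 0:
            continue                    # the script also skips a zero mtime
        plan.append((path, old, old + offset))
    return plan

# Demonstration on a throwaway file with a known mtime.
with tempfile.TemporaryDirectory() as tmpdir:
    sample = os.path.join(tmpdir, "sample.txt")
    open(sample, "w").close()
    os.utime(sample, (1_050_000_000, 1_050_000_000))
    missing = os.path.join(tmpdir, "missing.txt")
    for path, old, new in plan_shift([sample, missing], 3600):
        print(path, old, new)
```

Feeding it the same list you intend to pipe into shifttime.pl
gives a chance to check the direction of the offset before
anything is touched.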

			Prevention

To prevent the problem in the first place you need to
prevent the Windows system from changing to DST.  This can
be done either by running the Windows system in UTC or by
disabling DST and changing the system time manually twice
each year.


			Notes and References

Here are some references that Wayne Piekarski collected
while researching this problem.  They contain a lot of
information about the ways that Windows deals with
timestamps on NTFS and FAT filesystems.

http://optics.ph.unimelb.edu.au/help/rsync/rsync_pc1.html#gotchas
http://list-archive.xemacs.org/xemacs-nt/199911/msg00130.html
http://p2p.wrox.com/archive/c_plus_plus_programming/2001-06/53.asp
http://www.codeproject.com/datetime/dstbugs.asp
http://support.microsoft.com/default.aspx?scid=kb;[LN];158588

I wish to thank Wayne Piekarski for having compiled the
references and also supplying some additional insights.

I myself do not use Windows systems.  What I report here is
based on the reports of others and such documentation as I
have found.  I have not tested the included program code or
examples on Windows systems, so care should be exercised in
their use.  A success report would not be amiss, nor would
definitive info regarding the impact of cygwin on this.

Permission is granted without reservation to reprint and
distribute this in whole and in part to any interested
parties.

					J.W. Schultz
					jw AT pegasys DOT ws
