Date: Wed, 15 Oct 2003 19:31:15 -0700
From: jw schultz
To: rsync list, cygwin AT cygwin DOT com
Subject: [alert] DST change and date comparisons

Description of Problem

We are rapidly approaching the time of year when some will transition
from standard time (ST) to daylight saving time (DST) and others will
make the opposite transition. These vernal and autumnal transitions
have important implications for those who run Microsoft systems and
use utilities that compare file timestamps on different filesystem
types or with filesystems on other operating systems.

The problem lies in the way FAT filesystems store timestamps and how
Windows converts between local time and UTC.

In UNIX and UNIX-like systems such as Linux, file timestamps are
stored in UTC (universal time) and are only converted to local time by
user-space programs for display purposes. At the system call level all
time values are in UTC, and utilities that compare timestamps do so in
UTC. Also, the standard UTC->local and local->UTC conversion functions
are aware of DST, so a timestamp recorded during ST is converted using
the ST offset even when the current system time is DST.

In Windows things are not so simple. Windows operates in local time.
Timestamps in the various FAT-derived filesystems are stored in local
time.
Timestamps in NTFS filesystems are stored in UTC. This inconsistency
is further complicated by the fact that the conversion routines used
are not DST aware. Instead, the system applies a fixed offset to
convert between local time and UTC regardless of the date in the
timestamp. This fixed offset is calculated at boot time and only
changed when the system transitions to or from DST. As a result, the
apparent modification time of a file on NTFS changes by one hour when
a Windows utility reports it in local time, and that of a file on FAT
changes by one hour when it is reported in UTC.

The difficulty this produces is that any utility that compares
timestamps between FAT and NTFS filesystems, or between Windows and
other platforms, will view files that have not changed as being
different. Among other things this will affect rsync, rdiff, unison,
wget and make.

With the reduced cost of hard disks, many newer backup systems use
hard-disk-based storage and take advantage of timestamp comparison to
detect file changes for the sake of efficiency. Rsync is probably
premier in this role and is used by a fair number of free and even
commercial backup systems, as well as being the basis for many
home-brew backup solutions.

With rsync and similar systems the effect is that every file will
appear to have been changed. As a result, any space savings associated
with linking (--link-dest) or with decremental backup approaches
(--compare-dest and --backup-dir) will be defeated. Perhaps worse,
because every file will appear to have changed, the time required to
do a backup or a non-backup rsync will be much longer than normal. In
some cases backups that normally complete in less than one hour can
take several days.

So what can be done about it? Several things: there are ways to
merely mitigate the problem, to correct it, and finally to prevent it
entirely.

Mitigation

Rsync has a --modify-window option.
Many of you already use --modify-window=1 to cope with the fact that
Windows often stores timestamps with a two-second resolution. Using
--modify-window=3601 will cause rsync to ignore timestamp differences
of up to one hour. This isn't particularly dangerous: for it to miss a
file change, a file would have to be changed, synced and changed again
without changing size within a single hour, and then have no
subsequent changes. For this to happen during the hours most backups
occur is improbable. It may be advisable to activate this mode prior
to the time change in order to delay the issue until you know exactly
how you are affected.

Correction

There are two ways to correct the problem. If you are running in
mitigation mode you can do an rsync with a normal modify-window on
subsets of your data. Once all of a given directory tree has been
corrected you can return to the normal modify-window.

The other way to correct things is to change the timestamps on the
files modified before the change. This can be done afterwards or, if
you know in advance exactly what will happen, in advance. I include
here an example perl script that will change the timestamps of the
files named in a list on standard input. Whether you use a positive or
negative shift will depend on which end you decide to adjust.

This is an example of how to use the script:

    touch -d '01:00 13-apr-03' /tmp/cmpfile
    find . -type f ! -newer /tmp/cmpfile | shifttime.pl 3600

------ shifttime.pl -------
#!/usr/bin/perl
# Shift the access and modification times of the files named on
# standard input (one per line) by the given number of seconds.
use strict;
use warnings;

my $offset = shift || 0;
$offset += 0;
$offset or die "must specify offset\n";

while (<>) {
    chomp;
    -w or next;                     # skip files we cannot modify
    my $oldtime = (stat $_)[9];     # element 9 of stat is the mtime
    $oldtime or next;
    my $newtime = $oldtime + $offset;
    utime $newtime, $newtime, $_;   # set both atime and mtime
}
------ end -------------------

Prevention

To prevent the problem in the first place you need to prevent the
change to DST. This can be done either by running the Windows system
in UTC or by disabling DST and changing the system time manually twice
each year.
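
As an aside on the Mitigation section above, the effect of a wider
modify-window can be sketched in plain shell. This is only an
illustration, assuming that two timestamps count as equal when they
differ by no more than the window. same_within_window is a
hypothetical helper, not part of rsync, and the timestamp is a made-up
value.

```shell
#!/bin/sh
# same_within_window is a hypothetical helper for illustration only;
# it is not part of rsync. Arguments: mtime_a mtime_b window (seconds).
# Returns success when the two times differ by at most the window.
same_within_window() {
    if [ "$1" -ge "$2" ]; then
        diff=$(( $1 - $2 ))
    else
        diff=$(( $2 - $1 ))
    fi
    [ "$diff" -le "$3" ]
}

src=1066271475            # made-up mtime, seconds since the epoch
dst=$(( src + 3600 ))     # same file, reported one hour later

for window in 1 3601; do
    if same_within_window "$src" "$dst" "$window"; then
        echo "window=$window: treated as unchanged"
    else
        echo "window=$window: treated as changed"
    fi
done
```

With a window of 1 the one-hour shift makes the file look changed.
With a window of 3601 it does not, which is why that value hides the
DST shift while still tolerating the two-second FAT resolution.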

Notes and References

Here are some references that Wayne Piekarski collected while
researching this problem. They contain a lot of information about the
ways that Windows deals with timestamps on NTFS and FAT filesystems.

http://optics.ph.unimelb.edu.au/help/rsync/rsync_pc1.html#gotchas
http://list-archive.xemacs.org/xemacs-nt/199911/msg00130.html
http://p2p.wrox.com/archive/c_plus_plus_programming/2001-06/53.asp
http://www.codeproject.com/datetime/dstbugs.asp
http://support.microsoft.com/default.aspx?scid=kb;[LN];158588

I wish to thank Wayne Piekarski for having compiled the references and
for supplying some additional insights.

I myself do not use Windows systems. What I report here is based on
the reports of others and such documentation as I have found. I have
not tested the included program code or examples on Windows systems,
so care should be exercised in their use. A success report would not
be amiss, nor would definitive info regarding the impact of cygwin on
this.

Permission is granted without reservation to reprint and distribute
this in whole and in part to any interested parties.

J.W. Schultz
jw AT pegasys DOT ws