Subject: Re: untarring symlinks with ../ fails randomly, slightly OT
From: Wolf Geldmacher
To: cygwin AT cygwin DOT com
Date: Tue, 05 Jul 2011 17:53:03 +0200

On Tue, 2011-07-05 at 14:10 +0200, Corinna Vinschen wrote:
> On Jul  4 12:46, Corinna Vinschen wrote:
> > On Jul  4 11:15, Wolf Geldmacher wrote:
> > > As an aside:
> > > I also used to have some trouble with "rm -rf" of a directory
> > > hierarchy failing more or less reproducibly (like: 80% of the
> > > time) because files were presumably still "in use". Repeating
> > > the command several times would succeed, though.
> > >
> > > Downgrading from cygwin1.dll/1.7.9.1 to cygwin1.dll/1.7.8.1
> > > seems to have solved that issue as well - still have to see
> > > the first "retry to delete".
> > >
> > > This may or may not be related to the original report, as it also reeks
> > > of a race condition during file/directory operations.
> >
> > I can neither reproduce the tar problem, nor can I reproduce the rm
> > problem.  I tried this under 2008R2 which is basically the same as your
> > W7-64 bit.  I used local and remote drives to test the issue but to no
> > avail.
>
> Finally I managed to reproduce the problem and now I see what happens.
>
> Windows does not write back the file change timestamp unless the file
> buffers are flushed.  This usually occurs at close time.  In contrast to
> POSIX specifications the timestamps are *not* automatically updated when
> a call to fetch file metadata is performed.
>
> Here's what tar does when creating the symlink:
>
> 1. create file with 000 permissions
> 2. fstat
> 3. close file
> [...]
> 4. stat file
> 5. if fstat.st_ctime != stat.st_ctime ==> symlink placeholder has been
>    overwritten.
>
> The problem is that the call to fstat on the opened handle gets some
> value of the change time timestamp, but the subsequent close changes
> the timestamp again.
>
> Speculation: It seems that the timestamp fstat sees is the timestamp
> created at the time NtCreateFile is called, while the timestamp from the
> call to NtSetSecurityFile to change the DACL is cached and only updated
> when calling NtClose.
>
> This also explains why this doesn't occur in 1.7.8.  In 1.7.8, the DACL
> has been written using another file handle, because the original handle
> didn't have the right to change the DACL.  By adding the WRITE_DAC flag,
> I allowed Cygwin to use the original file handle to write the DACL.  The
> difference is:
>
> 1.7.8:
>
> - create file
>   - open file for writing the DACL
>   - write DACL
>   - close
> - do whatever the original handle was opened for
> - close
>
> 1.7.9:
>
> - create file
> - write DACL
> - do whatever the original handle was opened for
> - close
>
> So, with 1.7.9 the close call after writing the DACL is missing, which
> accounts for the missing flushing of the file metadata.
>
> By calling FlushFileBuffers in fstat before calling NtQueryInformationFile
> I can fix the problem.  Unfortunately that slows down applications like tar,
> which use fstat a lot, a lot.
>
> There are two solutions: one is reverting to the 1.7.8 state, which
> means that writing the DACL requires opening the file again, the other
> is calling FlushFileBuffers in fstat.
> I compared both solutions.  On my hardware, calling tar xzf on your file
> is 500% slower if fstat calls FlushFileBuffers compared to just dropping
> the WRITE_DAC flag from the open call.  Wow!  Imagine that I added the
> WRITE_DAC flag to gain performance...
>
> So I guess this all boils down to the fact that adding WRITE_DAC was
> not really a good move.  It's a shame that Windows punishes every try
> to speed up file operations with a rise in non-POSIXy behaviour :-(((
>
> I changed that in CVS and right now I'm generating a new developer
> snapshot on http://cygwn.com/snapshots/.  Give it a try, please.
>
>
> Thanks,
> Corinna

I downloaded and installed the daily dll: I can no longer reproduce the
"failing symlink" problem at all, which was 100% reproducible before.

So it looks like your diagnosis and the fix are correct.

Thank you very much for your support!

Regarding the "rm -rf failing" problem: Although I could no longer
reproduce the issue on the test machine when I downgraded to the older
dll, it *did* happen last night on the nightly build with a 1.7.8
cygwin1.dll - so it seems to be unrelated to the WRITE_DAC change,
which incidentally also agrees with Ryan's test results.

Thanks again & Regards
Wolf
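
For readers who want to try the check on their own system, the five-step
sequence quoted above (create with 000 permissions, fstat, close, stat,
compare st_ctime) can be sketched roughly as follows. This is only an
illustration and is not taken from tar or from either message in this
thread: the function name placeholder_ctime_changed and the file name
"placeholder" are made up for the example, and it assumes a Python 3
interpreter whose os module maps to the underlying fstat/stat calls.

```python
import os
import tempfile


def placeholder_ctime_changed(path):
    """Create an empty placeholder and report whether its ctime moved.

    Hypothetical helper that mirrors the quoted steps; it returns True
    when the change time seen through the open descriptor differs from
    the one seen by name after the descriptor has been closed.
    """
    # 1. create the file with 000 permissions (O_EXCL: it must not exist yet)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o000)
    try:
        # 2. fstat on the still-open descriptor
        before = os.fstat(fd)
    finally:
        # 3. close the file
        os.close(fd)
    # 4. stat the same file by name
    after = os.stat(path)
    # 5. a differing ctime is what gets read as "placeholder overwritten"
    return before.st_ctime_ns != after.st_ctime_ns


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "placeholder")
        print("ctime changed across close:", placeholder_ctime_changed(target))
```

On a system where closing the file does not touch the change time, the
helper should report False. A True result would correspond to the
situation described in the quoted analysis, where the timestamp seen by
fstat differs from the one seen after the close.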