Date: Fri, 10 Dec 2010 21:41:56 +0100
From: Matthias Andree
To: cygwin AT cygwin DOT com
Subject: Re: 1.7.7: rm -rf sometimes fails - race condition?

On 10.12.2010 19:26, Christopher Faylor wrote:
> On Fri, Dec 10, 2010 at 06:49:09PM +0100, Matthias Andree wrote:
>>Greetings,
>>
>>I see that "rm -rf" on a directory sometimes fails, like here:
>>
>>|>>> Creating source package
>>| fetchmail-6.3.19-1.cygport
>>| fetchmail-6.3.19-1.cygwin.patch
>>| fetchmail-6.3.19.tar.bz2
>>|>>> Removing work directory in 5 seconds...
>>|>>> Removing work directory NOW.
>>| rm: cannot remove `/usr/src/fetchmail-6.3.19-1/inst/usr/share/locale/da': Directory not empty
>>| Command exited with non-zero status 1
>>
>>Alternatively, you get "...in use" for an error, however, in this case, it
>>appears that the corresponding syscall triggered by rm(1) had already returned
>>but the file wasn't fully removed from the directory yet.
>>
>>I've seen this happen for a while now. This happens sporadically, and retrying
>>the operation usually succeeds, so it matters less in an interactive shell.
>>However, this often breaks scripts, in this case, cygport.
>>
>>This looks like either a premature return from a syscall or libcall, or like a
>>genuine race in the system.
>>
>>In case it matters, this is
>>- Windows 7 Prof. 32-bit German
>>- with Sophos Endpoint Security and Control ver. 9 and
>>- Microsoft Windows Defender.
>>- coreutils 8.5-2
>>- uname -a:
>>  CYGWIN_NT-6.1 somehost 1.7.7(0.230/5/3) 2010-08-31 09:58 i686 Cygwin
>>
>>
>>Has anyone seen similar things?
>
> Yes and you seem to have nailed the problem - it happens when a virus checker
> hooks into a syscall and allows it to return before completion. I don't think
> we want to modify Cygwin to not trust success return values from system calls.

The interesting part is that it also happens with Sophos uninstalled and the
Defender service stopped... so apparently it's not Sophos that causes this.

-- 
Matthias Andree
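
For scripts that hit this sporadic failure, a retry wrapper is one possible workaround, following the observation above that retrying the removal usually succeeds. What follows is a minimal sketch, not a fix for the underlying race. The helper name `rmtree_with_retry` and its `attempts` and `delay` parameters are hypothetical, chosen for illustration, and the default values are guesses rather than measured ones.

```python
import os
import shutil
import time


def rmtree_with_retry(path, attempts=5, delay=0.2):
    """Remove a directory tree, retrying while the path still exists.

    Hypothetical helper: the name, `attempts` and `delay` are assumptions
    made for illustration; none of this is part of Cygwin or coreutils.
    Returns the number of attempts that were needed.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            shutil.rmtree(path)
        except OSError as exc:
            # For example "Directory not empty" or "in use".
            last_error = exc
        # Do not trust the call alone: check that the path is really gone,
        # since the removal may be reported before it is complete.
        if not os.path.lexists(path):
            return attempt
        if attempt < attempts:
            time.sleep(delay)
    raise OSError(
        f"could not remove {path!r} after {attempts} attempts"
    ) from last_error
```

A script could call `rmtree_with_retry(workdir)` where it would otherwise run a bare `rm -rf`. Whether a short sleep is enough depends on what is holding the files, so this only narrows the window and does not close it.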