Mail Archives: cygwin/2011/10/23/16:54:03
Eric Blake wrote:
> POSIX is clear that attempts to rmdir() a directory that still has
> open descriptors may fail. Of course, on Linux, this (rather
> limiting) restriction is not present, so we don't notice it; but on
> Cygwin, there are certain file systems where this is a real problem,
> such as in this thread:
> http://cygwin.com/ml/cygwin/2011-10/msg00365.html
>
> Looking at an strace on Linux reveals the problem (abbreviated to show
> highlights here):
>
> $ mkdir -p a/b
> $ strace rm -f a
> ...
> openat(AT_FDCWD, "a", O_RDONLY|O_NOCTTY|O_NONBLOCK|O_DIRECTORY|O_NOFOLLOW) = 3
> ...
> fcntl(3, F_DUPFD, 3) = 4
> ...
> close(3) = 0
> ...
> openat(4, "b", O_RDONLY|O_NOCTTY|O_NONBLOCK|O_DIRECTORY|O_NOFOLLOW) = 3
> ...
> fcntl(3, F_DUPFD, 3) = 5
> ...
> close(3) = 0
> close(5) = 0
> unlinkat(4, "b", AT_REMOVEDIR) = 0
> unlinkat(AT_FDCWD, "a", AT_REMOVEDIR) = 0
> close(4) = 0
>
> Notice that for subdirectories, we opened the directory, then used dup
> to have a handle for use in further *at calls, then do
> fdopendir/readdir/closedir on the DIR*, then close the duplicate fd,
> all before calling unlinkat (aka rmdir) on that subdirectory. But for
> the top-level directory, the dup'd fd (4) is still open when we
> attempt the unlinkat.
Thanks for the analysis, Eric.
That was due to a rather subtle but easy/safe-to-fix bug.
While the rm from coreutils-8.14 worked as your strace above shows, the
fixed one does this: (note how the close(4) now precedes the removal of "a")
$ mkdir -p a/b
$ strace -e openat,close,unlinkat ./rm -rf a
close(3) = 0
close(3) = 0
openat(AT_FDCWD, "a", O_RDONLY|O_NOCTTY|O_NONBLOCK|O_DIRECTORY|O_NOFOLLOW) = 3
close(3) = 0
openat(4, "b", O_RDONLY|O_NOCTTY|O_NONBLOCK|O_DIRECTORY|O_NOFOLLOW) = 3
close(3) = 0
close(5) = 0
unlinkat(4, "b", AT_REMOVEDIR) = 0
close(4) = 0
unlinkat(AT_FDCWD, "a", AT_REMOVEDIR) = 0
close(0) = 0
close(1) = 0
close(2) = 0
Here is the patch that I expect to push tomorrow:
From a11c49cd72a91c05a272e36ff5d3cd92675cbfb5 Mon Sep 17 00:00:00 2001
From: Jim Meyering <meyering AT redhat DOT com>
Date: Sun, 23 Oct 2011 22:42:25 +0200
Subject: [PATCH] fts: close parent dir FD before returning from
post-traversal fts_read
The problem: the fts-using "rm -rf A/B/" would attempt to unlink A,
while a file descriptor open on A remained. This is suboptimal
(holding a file descriptor open longer than needed) on Linux, but
otherwise not a problem. However, on Cygwin with certain file system
types, (see http://cygwin.com/ml/cygwin/2011-10/msg00365.html), that
represents a real problem: it causes the removal of A to fail with
e.g., "rm: cannot remove `A': Device or resource busy"
fts visits each directory twice and keeps a cache (fts_fd_ring) of
directory file descriptors. After completing the final, FTS_DP,
visit of a directory, RESTORE_INITIAL_CWD intended to clear the FD
cache, but then proceeded to add a new FD to it via the subsequent
FCHDIR (which calls cwd_advance_fd and i_ring_push). Before, the
final file descriptor would be closed only via fts_close's call to
fd_ring_clear. Now, it is usually closed earlier, via the final
FTS_DP-returning fts_read call.
* lib/fts.c (restore_initial_cwd): New function, converted from
the macro. Call fd_ring_clear *after* FCHDIR, not before it.
Update callers.
Reported by Franz Sirl via the above URL, with analysis by Eric Blake
in http://thread.gmane.org/gmane.comp.lib.gnulib.bugs/28739
---
ChangeLog | 25 +++++++++++++++++++++++++
lib/fts.c | 23 +++++++++++++++--------
2 files changed, 40 insertions(+), 8 deletions(-)
diff --git a/ChangeLog b/ChangeLog
index 93ee45e..3a2d2cc 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,28 @@
+2011-10-23 Jim Meyering <meyering AT redhat DOT com>
+
+ fts: close parent dir FD before returning from post-traversal fts_read
+ The problem: the fts-using "rm -rf A/B/" would attempt to unlink A,
+ while a file descriptor open on A remained. This is suboptimal
+ (holding a file descriptor open longer than needed) on Linux, but
+ otherwise not a problem. However, on Cygwin with certain file system
+ types, (see http://cygwin.com/ml/cygwin/2011-10/msg00365.html), that
+ represents a real problem: it causes the removal of A to fail with
+ e.g., "rm: cannot remove `A': Device or resource busy"
+
+ fts visits each directory twice and keeps a cache (fts_fd_ring) of
+ directory file descriptors. After completing the final, FTS_DP,
+ visit of a directory, RESTORE_INITIAL_CWD intended to clear the FD
+ cache, but then proceeded to add a new FD to it via the subsequent
+ FCHDIR (which calls cwd_advance_fd and i_ring_push). Before, the
+ final file descriptor would be closed only via fts_close's call to
+ fd_ring_clear. Now, it is usually closed earlier, via the final
+ FTS_DP-returning fts_read call.
+ * lib/fts.c (restore_initial_cwd): New function, converted from
+ the macro. Call fd_ring_clear *after* FCHDIR, not before it.
+ Update callers.
+ Reported by Franz Sirl via the above URL, with analysis by Eric Blake
+ in http://thread.gmane.org/gmane.comp.lib.gnulib.bugs/28739
+
2011-10-23 Gary V. Vaughan <gary AT gnu DOT org>
Bruno Haible <bruno AT clisp DOT org>
Jim Meyering <jim AT meyering DOT net>
diff --git a/lib/fts.c b/lib/fts.c
index e3829f3..f61a91e 100644
--- a/lib/fts.c
+++ b/lib/fts.c
@@ -229,11 +229,6 @@ static int fts_safe_changedir (FTS *, FTSENT *, int, const char *)
#define ISSET(opt) (sp->fts_options & (opt))
#define SET(opt) (sp->fts_options |= (opt))
-/* FIXME: make this a function */
-#define RESTORE_INITIAL_CWD(sp) \
- (fd_ring_clear (&((sp)->fts_fd_ring)), \
- FCHDIR ((sp), (ISSET (FTS_CWDFD) ? AT_FDCWD : (sp)->fts_rfd)))
-
/* FIXME: FTS_NOCHDIR is now misnamed.
Call it FTS_USE_FULL_RELATIVE_FILE_NAMES instead. */
#define FCHDIR(sp, fd) \
@@ -349,6 +344,18 @@ cwd_advance_fd (FTS *sp, int fd, bool chdir_down_one)
sp->fts_cwd_fd = fd;
}
+/* Restore the initial, pre-traversal, "working directory".
+ In FTS_CWDFD mode, we merely call cwd_advance_fd, otherwise,
+ we may actually change the working directory.
+ Return 0 upon success. Upon failure, set errno and return nonzero. */
+static int
+restore_initial_cwd (FTS *sp)
+{
+ int fail = FCHDIR (sp, ISSET (FTS_CWDFD) ? AT_FDCWD : sp->fts_rfd);
+ fd_ring_clear (&(sp->fts_fd_ring));
+ return fail;
+}
+
/* Open the directory DIR if possible, and return a file
descriptor. Return -1 and set errno on failure. It doesn't matter
whether the file descriptor has read or write access. */
@@ -948,7 +955,7 @@ next: tmp = p;
* root.
*/
if (p->fts_level == FTS_ROOTLEVEL) {
- if (RESTORE_INITIAL_CWD(sp)) {
+ if (restore_initial_cwd(sp)) {
SET(FTS_STOP);
return (NULL);
}
@@ -1055,7 +1062,7 @@ cd_dot_dot:
* one level, via "..".
*/
if (p->fts_level == FTS_ROOTLEVEL) {
- if (RESTORE_INITIAL_CWD(sp)) {
+ if (restore_initial_cwd(sp)) {
p->fts_errno = errno;
SET(FTS_STOP);
}
@@ -1579,7 +1586,7 @@ mem1: saved_errno = errno;
*/
if (!continue_readdir && descend && (type == BCHILD || !nitems) &&
(cur->fts_level == FTS_ROOTLEVEL
- ? RESTORE_INITIAL_CWD(sp)
+ ? restore_initial_cwd(sp)
: fts_safe_changedir(sp, cur->fts_parent, -1, ".."))) {
cur->fts_info = FTS_ERR;
SET(FTS_STOP);
--
1.7.7.419.g87009
--
Problem reports: http://cygwin.com/problems.html
FAQ: http://cygwin.com/faq/
Documentation: http://cygwin.com/docs.html
Unsubscribe info: http://cygwin.com/ml/#unsubscribe-simple
- Raw text -