Date: Tue, 27 Feb 2007 14:58:06 -0500 (EST)
From: Igor Peshansky
Reply-To: cygwin AT cygwin DOT com
To: Phil Edwards
cc: Furash Gary, cygwin AT cygwin DOT com, ebb9 AT byu DOT net
Subject: Re: Strange message from updatedb

On Tue, 27 Feb 2007, Phil Edwards wrote:

> On 2/27/07, Furash Gary wrote:
> > Thanks.
> >
> > /cygdrive/c/System\ Volume\ Information
>
> Quotes and backslashes aren't going to solve the problem, I think.  I
> looked at updatedb (it's a shell script), and the --prunepaths
> argument is passed through a sed script which replaces spaces in order
> to turn it all into a regexp.  There's no way of telling sed to avoid
> some spaces and translate others.

That's not quite true.  Quoted arguments will be harder, but for
backslash-escaped spaces it's reasonably easy.  Something like

  sed -e 's,\\\\,\e,g' -e 's,\([^\\]\) ,\1#,g' -e 's,\e,\\\\,g'

replaces all unescaped spaces with '#'s, while preserving escaped spaces
and backslashes.  The idea for quotes would be similar, except you first
have to replace all spaces within matching quotes by some character
unlikely to occur in the string (the above assumes that ESC will not be in
the string).  The sed info page provides an example of a similar approach.
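To make the three steps of the command above concrete, here is a minimal
sketch wrapped in a shell function.  The function name
mark_unescaped_spaces is hypothetical and not part of updatedb.  Not every
sed accepts \e as an escape for ESC, so this version builds the ESC
character with printf instead; like the original, it assumes the input
contains no ESC characters.

```shell
#!/bin/sh
# Hypothetical helper (not part of updatedb): replace unescaped spaces
# with '#', leaving backslash-escaped spaces and backslashes untouched.
# Assumption: the input never contains an ESC character.
esc=$(printf '\033')

mark_unescaped_spaces() {
    # 1. Hide escaped backslashes (\\) behind ESC so they cannot be
    #    mistaken for the backslash that escapes a following space.
    # 2. Turn every space preceded by a non-backslash character into '#'.
    # 3. Restore the hidden escaped backslashes.
    sed -e 's,\\\\,'"$esc"',g' \
        -e 's,\([^\\]\) ,\1#,g' \
        -e 's,'"$esc"',\\\\,g'
}

# The first path contains escaped spaces; the separators are unescaped.
printf '%s\n' '/cygdrive/c/System\ Volume\ Information /tmp /proc' |
    mark_unescaped_spaces
# prints: /cygdrive/c/System\ Volume\ Information#/tmp#/proc
```

Because the second expression needs a character in front of the space, a
leading space or the second of two consecutive spaces is left alone, so
this is a sketch of the idea rather than a drop-in fix for updatedb.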
> You used to be able to set the internal PRUNEREGEX variable directly,
> in a .conf file, but apparently that file is only used under Linux
> versions of updatedb, or something.

The Cygwin version of updatedb comes from GNU findutils, as do the Linux
versions, IIRC.  So the behavior should be the same, unless the configure
options differed when the packages were built.  This is something best
answered by the findutils maintainer...

> Most lists of dirs are passed around with colon (or some such)
> separators to avoid just such problems with paths containing
> whitespace.  updatedb is still living in the 80's.

Well, it's a matter of convention.  Colons are legal in filenames on Unix,
as is pretty much any character except for NUL and '/'.  However, many
tools treat colons specially, so a colon is conventionally used as a
separator.  If you have to pick a character to use as a path separator, a
space is as good as any.  Whichever character you pick, you'd still need
quoting or escape characters to represent it inside a path.
        Igor
-- 
                                http://cs.nyu.edu/~pechtcha/
      |\      _,,,---,,_            pechtcha AT cs DOT nyu DOT edu | igor AT watson DOT ibm DOT com
ZZZzz /,`.-'`'    -.  ;-;;,_        Igor Peshansky, Ph.D. (name changed!)
     |,4-  ) )-,_. ,\ (  `'-'       old name: Igor Pechtchanski
    '---''(_/--'  `-'\_) fL         a.k.a JaguaR-R-R-r-r-r-.-.-.  Meow!

Freedom is just another word for "nothing left to lose"... -- Janis Joplin

--
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
Problem reports:       http://cygwin.com/problems.html
Documentation:         http://cygwin.com/docs.html
FAQ:                   http://cygwin.com/faq/