Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm
Sender: cygwin-owner AT cygwin DOT com
Mail-Followup-To: cygwin AT cygwin DOT com
Delivered-To: mailing list cygwin AT cygwin DOT com
Message-Id: <5.2.0.9.2.20030213190807.03359eb0@pop3.cris.com>
X-Sender: rrschulz AT pop3 DOT cris DOT com
Date: Thu, 13 Feb 2003 20:19:41 -0800
To: cygwin AT cygwin DOT com
From: Randall R Schulz
Subject: Re: Wget ignores robot.txt entry
In-Reply-To: <034901c2d3d5$bf3e2150$78d96f83@pomello>
References: <5 DOT 2 DOT 0 DOT 9 DOT 2 DOT 20030213182750 DOT 01e97e98 AT pop3 DOT cris DOT com> <5 DOT 2 DOT 0 DOT 9 DOT 2 DOT 20030213185143 DOT 01da0ef0 AT pop3 DOT cris DOT com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed

Max,

No, I don't think cURL does recursive retrieval. I don't think it does Web page dependency retrieval, either. Both of these are a big deal for me.

How could a tool of wget's versatility be replaced by something inferior? Whatever happened to technological meritocracy? (Please, no laughing.)

I was actually hoping to get some time to work on an extension to wget of my own. I wanted to add an option that would cause wget to look in one hierarchy to determine file existence and modification times relative to the set of files and mod times on the server and download new or newer files to a different location. That way I can easily maintain mirror copies on a CD-ROM. I'd tell wget to use the CD's contents as the file and mod-time reference and to download to a location on my hard drive (of course). Then I could incrementally update the ROM with whatever was downloaded.

Of course I can still do that and I may yet. Does that sound like a desirable feature to anyone? I don't know how many people share my mania for keeping local archives of content from the Internet.

What happens to an open source project when it devolves to this state?
Who, for example, could hand out writable access to the wget CVS repository? Surely this isn't an unrecoverable state of affairs, is it?

Randall Schulz


At 19:04 2003-02-13, Max Bowsher wrote:
>Randall R Schulz wrote:
> > Wget is orphaned? That's bad news, since it seems to have it all over
> > cURL. (Sure. Go ahead and prove me wrong. I might as well get it over
> > with... for now.)
>
>cURL doesn't do recursive web-suck (does it?)
>
>Yes, wget is orphaned. There's no one on the wget mailing list who has CVS
>write access. Which is a great shame, as there are a surprising amount of
>patches being sent in.
>
>
>Max.

--
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
Bug reporting:         http://cygwin.com/bugs.html
Documentation:         http://cygwin.com/docs.html
FAQ:                   http://cygwin.com/faq/
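
The mirroring idea described in the message (use one hierarchy, such as a CD-ROM, as the existence and mod-time reference, and download new or newer files to a different location) might be sketched roughly as follows. This is not wget code and does not correspond to any existing wget option; the function names files_to_fetch and staging_path, and the remote_mtimes dictionary standing in for the server's file listing, are hypothetical and chosen only for illustration.

```python
from pathlib import Path


def files_to_fetch(remote_mtimes, reference_root):
    """Return relative paths that are missing from, or newer than, the reference tree.

    remote_mtimes: dict mapping relative paths to server modification
        times in seconds since the epoch (hypothetical input; how the
        listing is obtained from the server is out of scope here).
    reference_root: directory holding the existing mirror, for example
        a mounted CD-ROM.
    """
    reference_root = Path(reference_root)
    wanted = []
    for rel_path, remote_mtime in sorted(remote_mtimes.items()):
        local = reference_root / rel_path
        if not local.is_file():
            # Not present in the reference hierarchy at all: new file.
            wanted.append(rel_path)
        elif remote_mtime > local.stat().st_mtime:
            # Present, but the server copy is newer than the reference copy.
            wanted.append(rel_path)
    return wanted


def staging_path(download_root, rel_path):
    """Map a relative path to its destination under the separate download tree."""
    return Path(download_root) / rel_path
```

Under these assumptions, a caller would fetch each path returned by files_to_fetch into staging_path(download_root, rel_path) on the hard drive, leaving the reference hierarchy untouched, and later add the staged files to the CD-ROM in an incremental session.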