Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm
Sender: cygwin-owner AT cygwin DOT com
Mail-Followup-To: cygwin AT cygwin DOT com
Delivered-To: mailing list cygwin AT cygwin DOT com
Message-ID: <3E4C511E.9060800@serv.net>
Date: Thu, 13 Feb 2003 18:14:54 -0800
From: L Anderson
Organization: TBD
User-Agent: Mozilla/5.0 (Windows; U; Win98; en-US; rv:1.0.2) Gecko/20021120 Netscape/7.01
X-Accept-Language: en,ru
MIME-Version: 1.0
To: cygwinList
Subject: Wget ignores robot.txt entry
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit

Using the latest of things Cygwin, I downloaded some stuff with wget from to peruse off-line and noticed a problem I can't explain:

The file has the entries:

    User-agent: *
    Disallow: /snapshots/
    Disallow: /cgi-bin/
    Disallow: /cgi2-bin/

so wget should not download /cgi-bin/. However,

    wget -o cygwincom.log -m -p --no-parent -X /cygwin,/ml http://cygwin.com/

downloads /cgi-bin anyway.

NB: Adding /cgi-bin to the exclude list,

    wget -o cygwincom.log -m -p --no-parent -X /cgi-bin,/cygwin,/ml http://cygwin.com/

doesn't download /cgi-bin.

I ran a validity check on and found no errors.

Is this a bug in wget or am I doing something wrong?

Thanks,

Lowell Anderson
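
As a cross-check on the rules quoted above, here is a minimal sketch that uses Python's standard urllib.robotparser to see which paths those entries disallow. It only evaluates the rules themselves and says nothing about what wget does internally. The helper name is_allowed, the "Wget" user-agent string, and the sample paths (including /cgi-bin/example) are assumptions made for illustration, not taken from the site.

```python
from urllib.robotparser import RobotFileParser

# The entries quoted in the message above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /snapshots/
Disallow: /cgi-bin/
Disallow: /cgi2-bin/
"""


def is_allowed(path, user_agent="Wget"):
    """Return True if the quoted rules permit fetching `path`.

    `user_agent` is an assumed name; the rules apply to every agent ("*").
    """
    parser = RobotFileParser()
    # Parse the rules from memory so no network access is needed.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, "http://cygwin.com" + path)


if __name__ == "__main__":
    # Sample paths are hypothetical and only exercise the rules.
    for path in ("/cgi-bin/", "/cgi-bin/example", "/snapshots/", "/faq/"):
        verdict = "allowed" if is_allowed(path) else "disallowed"
        print(path, verdict)
```

If this parser reports a path under /cgi-bin/ as disallowed, the entries themselves read as intended, which would narrow the question to how wget was invoked or how it applied the file.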