Mail Archives: cygwin/2004/09/27/20:34:04

Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm
List-Subscribe: <mailto:cygwin-subscribe AT cygwin DOT com>
List-Archive: <http://sourceware.org/ml/cygwin/>
List-Post: <mailto:cygwin AT cygwin DOT com>
List-Help: <mailto:cygwin-help AT cygwin DOT com>, <http://sourceware.org/ml/#faqs>
Sender: cygwin-owner AT cygwin DOT com
Mail-Followup-To: cygwin AT cygwin DOT com
Delivered-To: mailing list cygwin AT cygwin DOT com
Message-ID: <4158B157.1070507@serv.net>
Date: Mon, 27 Sep 2004 17:33:27 -0700
From: L Anderson <lowella AT serv DOT net>
User-Agent: Mozilla/5.0 (Windows; U; Win98; en-US; rv:1.7.1) Gecko/20040707
MIME-Version: 1.0
To: cygwin AT cygwin DOT com
Subject: Wget incorrectly mirroring some web site directories and files.
X-IsSubscribed: yes

When I use 'wget' to mirror some web sites, I notice a peculiar behavior 
that I'm unable to find described in the mail archives or with Google.  
To wit (using MS Windows directory notation):

If a web site has something like the following stored on 'host':
	...
	root\a\u.html
	root\a\images\v.jpg
	root\a\images\w.gif
	...
where the web page 'u.html' contains the references:

	'...src="images\v.jpg"...'
	'...src="images\w.gif"...',

the images 'v.jpg' and 'w.gif' show nicely when browsing the page 
'u.html' at 'host'.  However, mirroring 'host\root' web pages using:

	'wget -m -p -np http://host/root/'

results in 'wget' creating on my machine:
	...
	...\host\root\a\u.html
	...\host\root\a\images%5Cv.jpg
	...\host\root\a\images%5Cw.gif
	...\host\root\a\images\...
	...
rather than
	...
	...\host\root\a\u.html
	...\host\root\a\images\v.jpg
	...\host\root\a\images\w.gif
	...
as expected.  Needless to say, this results in the images 'v.jpg' and 
'w.gif' not displaying when browsing 'u.html' locally because they are 
renamed and then stored in their "parent" directory.
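
A possible after-the-fact repair, sketched in Python: walk the mirrored 
tree and move each file whose name contains an encoded backslash ('%5C') 
into the subdirectory that name implies.  The function name 
'repair_mirror' and the assumption that '%5C' in a file name always 
stands for a directory separator are mine, not anything 'wget' provides.  
Whether 'u.html' then displays correctly still depends on the local 
browser accepting '\' in the 'src' references.

```python
import os
import re

# Hypothetical helper: assumes every "%5C" in a mirrored file name was
# meant to be a directory separator.
ENCODED_BACKSLASH = re.compile(r"%5[Cc]")

def repair_mirror(top):
    """Move files like 'images%5Cv.jpg' to 'images/v.jpg' under top."""
    # Collect first, move afterwards, so the tree is not changed mid-walk.
    pending = []
    for dirpath, _dirnames, filenames in os.walk(top):
        for name in filenames:
            parts = ENCODED_BACKSLASH.split(name)
            if len(parts) > 1:
                src = os.path.join(dirpath, name)
                dst = os.path.join(dirpath, *parts)
                pending.append((src, dst))

    moved = []
    for src, dst in pending:
        os.makedirs(os.path.dirname(dst), exist_ok=True)
        os.replace(src, dst)  # overwrites an existing file at dst
        moved.append(dst)
    return moved
```

Calling 'repair_mirror' on the top of the mirrored tree returns the list 
of new paths, which is worth checking by hand before deleting anything.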

Has anyone else noticed this behavior?

Has it been previously described and I just missed it?

Is it a bug/feature in 'wget', i.e., does 'wget' treat the '\' not as a 
directory separator but as part of the file name, thus converting things 
like 'x\y' to 'x%5Cy', or is it caused by some other part of Cygwin?
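
To illustrate the first possibility, here is a minimal Python sketch of 
the two ways a client might resolve 'images\v.jpg' against the page URL.  
It is not 'wget' code; the function names 'resolve_strict' and 
'resolve_lenient' are made up for illustration.  The strict version 
treats '\' as an ordinary character and percent-encodes it, which yields 
the same '%5C' seen in the mirrored file names.  The lenient version 
first rewrites '\' to '/', which is my assumption about what a browser 
that displays the images is doing.

```python
from urllib.parse import quote, urljoin

def resolve_strict(base_url, ref):
    """Treat a backslash as plain data: it is percent-encoded as %5C."""
    return urljoin(base_url, quote(ref))

def resolve_lenient(base_url, ref):
    """Treat a backslash as a path separator before resolving."""
    return urljoin(base_url, quote(ref.replace("\\", "/")))

base = "http://host/root/a/u.html"
print(resolve_strict(base, "images\\v.jpg"))
print(resolve_lenient(base, "images\\v.jpg"))
```

The strict result ends in 'images%5Cv.jpg', a single path component in 
directory 'a'; the lenient one ends in 'images/v.jpg'.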

Also, I seem to remember reading that 'wget' is an orphan, with no 
maintainer.  Is this still true, and if so, is it because there are 
other, preferred replacements for 'wget'?

Before digging deeper into this problem, I would greatly appreciate any 
insight and information anyone can provide me.

I'm running:	Win98SE
		Wget 1.9.1-1
		Cygwin 1.5.10-cr-0x5e6
		Setup: unix, all users

Thanks,

Lowell Anderson



--
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
Problem reports:       http://cygwin.com/problems.html
Documentation:         http://cygwin.com/docs.html
FAQ:                   http://cygwin.com/faq/
