Message-ID: <434CCFDE.7000409@holdenweb.com>
Date: Wed, 12 Oct 2005 09:57:02 +0100
From: Steve Holden
Newsgroups: gmane.comp.python.general
To: johnnyandfiona AT hotmail DOT com
CC: cygwin AT cygwin DOT com
Subject: Re: A problem while using urllib
In-Reply-To: <434CC861.2080805@holdenweb.com>

Steve Holden wrote:
> Johnny Lee wrote:
> [...]
>
>>I've sent the source, thanks for your help.
>>
>
> [...]
> Preliminary result, in case this rings bells with people who use urllib2
> quite a lot. I modified the error case to report the actual message
> returned with the exception and I'm seeing things like:
>
> http://www.holdenweb.com/./Python/webframeworks.html
> Message:
> Start process
> http://www.amazon.com/exec/obidos/ASIN/0596001886/steveholden-20
> Error: IOError while parsing
> http://www.amazon.com/exec/obidos/ASIN/0596001886/steveholden-20
> Message:
> .
> .
> .
>
> So at least we know now what the error is, and it looks like some sort
> of resource limit (though why only on Cygwin beats me) ... anyone,
> before I start some serious debugging?

I realized after this post that WingIDE doesn't run under Cygwin, so I
modified the code further to raise an error and give us a proper
traceback. I also tested the program under the standard Windows 2.4.1
release, where it didn't fail, so I conclude you have unearthed a Cygwin
socket bug. Here's the traceback:

End process http://www.holdenweb.com/contact.html
Start process http://freshmeat.net/releases/192449
Error: IOError while parsing http://freshmeat.net/releases/192449
Message:
Traceback (most recent call last):
  File "Spider_bug.py", line 225, in ?
    spider.run()
  File "Spider_bug.py", line 143, in run
    self.grabUrl(tempUrl)
  File "Spider_bug.py", line 166, in grabUrl
    webPage = urllib2.urlopen(url).read()
  File "/usr/lib/python2.4/urllib2.py", line 130, in urlopen
    return _opener.open(url, data)
  File "/usr/lib/python2.4/urllib2.py", line 358, in open
    response = self._open(req, data)
  File "/usr/lib/python2.4/urllib2.py", line 376, in _open
    '_open', req)
  File "/usr/lib/python2.4/urllib2.py", line 337, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.4/urllib2.py", line 1021, in http_open
    return self.do_open(httplib.HTTPConnection, req)
  File "/usr/lib/python2.4/urllib2.py", line 996, in do_open
    raise URLError(err)
urllib2.URLError:

Looking at that part of the source of urllib2 we see:

        headers["Connection"] = "close"
        try:
            h.request(req.get_method(), req.get_selector(), req.data, headers)
            r = h.getresponse()
        except socket.error, err: # XXX what error?
            raise URLError(err)

So the URLError is only a wrapper around a socket.error raised while
sending the request or reading the response. My conclusion is that
there's something in the Cygwin socket module that causes problems not
seen under other platforms. I couldn't find any obviously-related error
in the Python bug tracker, and I have copied this message to the Cygwin
list in case someone there knows what the problem is.
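
To find out what the URLError above is actually wrapping, something
along these lines might help. It is only a sketch under a few
assumptions: it is written for a current Python 3, where urllib2.urlopen
lives in urllib.request and URLError in urllib.error, and the names
describe_failure and fetch_all are made up for illustration; they are
not taken from Spider_bug.py or from the standard library.

```python
import errno
import urllib.error
import urllib.request


def describe_failure(err):
    """Summarise the object a URLError is wrapping (hypothetical helper)."""
    reason = err.reason
    if isinstance(reason, OSError):
        # Socket errors are OSError subclasses and normally carry an errno.
        name = errno.errorcode.get(reason.errno, "unknown")
        return "%s: errno=%s (%s) %s" % (
            type(reason).__name__, reason.errno, name, reason.strerror)
    # Anything else (for example a plain string) is reported as-is.
    return "%s: %r" % (type(reason).__name__, reason)


def fetch_all(urls, opener=urllib.request.urlopen):
    """Fetch each URL in turn and return (ok_count, failures).

    failures is a list of (url, description) pairs, one per URLError.
    The opener argument exists only so the loop can be exercised
    without touching the network.
    """
    ok = 0
    failures = []
    for url in urls:
        try:
            response = opener(url)
            try:
                response.read()
            finally:
                # Close explicitly so sockets are not left for the
                # garbage collector to release.
                response.close()
            ok += 1
        except urllib.error.URLError as err:
            failures.append((url, describe_failure(err)))
    return ok, failures
```

Running fetch_all over the same list of URLs the spider visits, once
under Cygwin and once under the native Windows build, would show whether
the same errno turns up every time and after roughly how many successful
fetches.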

Before making any kind of bug submission you should really see if you
can build a program shorter than the existing 220+ lines to demonstrate
the bug, but it does look to me like your program should work (as indeed
it does on other platforms).

regards
 Steve
-- 
Steve Holden       +44 150 684 7255  +1 800 494 3119
Holden Web LLC                     www.holdenweb.com
PyCon TX 2006                  www.python.org/pycon/