selenium: socket.error: [Errno 61] Connection refused

Posted: 2014-09-28 20:58:40

Question:

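I want to scrape 10 links. When I run the spider, the links do end up in the JSON file, but I still get errors like the one below. It looks as if Selenium runs twice. What is the problem? Any guidance would be appreciated, thank you.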

2014-08-06 10:30:26+0800 [spider2] DEBUG: Scraped from <200 http://www.test/a/1>
'link': u'http://www.test/a/1'
2014-08-06 10:30:26+0800 [spider2] ERROR: Spider error processing <GET
http://www.test/a/1>
Traceback (most recent call last):
 ........
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 571, in create_connection
    raise err
socket.error: [Errno 61] Connection refused

Here is my code:

from selenium import webdriver
from scrapy.spider import Spider
from ta.items import TaItem
from selenium.webdriver.support.wait import WebDriverWait
from scrapy.http.request import Request

class ProductSpider(Spider):
    name = "spider2"  
    start_urls = ['http://www.test.com/']
    def __init__(self):
        self.driver = webdriver.Firefox()

    def parse(self, response):
        self.driver.get(response.url)
        self.driver.implicitly_wait(20)  
        next = self.driver.find_elements_by_css_selector("div.body .heading a")
        for a in next:
            item = TaItem()    
            item['link'] =  a.get_attribute("href")     
            yield Request(url=item['link'], meta={'item': item}, callback=self.parse_detail)

    def parse_detail(self,response):
        item = response.meta['item']
        yield item
        self.driver.close()

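Comments:

Why are you calling self.driver.close() in parse_detail()?

I removed that line and the error went away. Does that mean I don't have to close the driver, or where should I close it?

Answer 1: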

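The problem is that you are closing the driver too early. parse_detail() runs once for every scraped link, so the first call closes the browser and later calls most likely fail because they try to reach a driver that is already gone.

The driver should be closed only once the spider has finished its work. To do that, listen for the spider_closed signal: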

from scrapy import signals
from scrapy.xlib.pydispatch import dispatcher
from selenium import webdriver
from scrapy.spider import Spider
from ta.items import TaItem
from scrapy.http.request import Request


class ProductSpider(Spider):
    name = "spider2"  
    start_urls = ['http://www.test.com/']
    def __init__(self):
        self.driver = webdriver.Firefox()
        dispatcher.connect(self.spider_closed, signals.spider_closed)

    def parse(self, response):
        self.driver.get(response.url)
        self.driver.implicitly_wait(20)  
        next = self.driver.find_elements_by_css_selector("div.body .heading a")
        for a in next:
            item = TaItem()    
            item['link'] =  a.get_attribute("href")     
            yield Request(url=item['link'], meta={'item': item}, callback=self.parse_detail)

    def parse_detail(self,response):
        item = response.meta['item']
        yield item

    def spider_closed(self, spider):
        self.driver.close()

See also: scrapy: Call a function when a spider quits.
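
To make the difference concrete, here is a minimal simulation that needs neither Scrapy nor Selenium. It is only a sketch under stated assumptions: FakeDriver and run_spider are hypothetical names invented for illustration, and the fake close() is assumed to behave roughly like a real driver whose browser has already exited. It compares closing the driver inside every callback with closing it once at the end.

```python
class FakeDriver:
    """Hypothetical stand-in for a Selenium driver (not the real API)."""

    def __init__(self):
        self.alive = True
        self.close_calls = 0

    def close(self):
        self.close_calls += 1
        if not self.alive:
            # Assumption: talking to a browser that is already gone fails
            # in the same way as the error reported in the question.
            raise ConnectionRefusedError(61, "Connection refused")
        self.alive = False


def run_spider(links, close_in_callback):
    """Simulate one crawl and return (scraped_items, errors)."""
    driver = FakeDriver()
    items, errors = [], []
    for link in links:                # one parse_detail() call per link
        items.append({"link": link})  # the item is yielded before close()
        if close_in_callback:
            try:
                driver.close()
            except ConnectionRefusedError as exc:
                errors.append(str(exc))
    if not close_in_callback:
        driver.close()                # spider_closed handler: runs once
    return items, errors


links = ["http://www.test/a/%d" % i for i in range(1, 4)]
early_items, early_errors = run_spider(links, close_in_callback=True)
late_items, late_errors = run_spider(links, close_in_callback=False)

print("close in callback:", len(early_items), "items,", len(early_errors), "errors")
print("close at the end: ", len(late_items), "items,", len(late_errors), "errors")
```

In this simulation every item is still collected in both runs, which would match the question's observation that the links reach the JSON file even though errors are logged. Only the run that closes inside the callback records errors, one for each call after the first.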
