Setting up an IP proxy pool in Scrapy

Posted by 自说自话唉

Editor's note: This post was compiled by the editors at 小常识网 (cha138.com). It covers how to set up an IP proxy pool in Scrapy and is intended as a practical reference.

middlewares.py

import random, base64   # base64 is only needed when the proxy requires authentication

class ProxyMiddleware(object):

    # Proxy addresses; in practice each entry usually includes a port, e.g. "ip:port".
    proxyList = ["61.129.70.131", "120.204.85.29"]

    def process_request(self, request, spider):
        # Pick a random proxy from the pool for every outgoing request.
        pro_adr = random.choice(self.proxyList)
        print("USE PROXY -> " + pro_adr)
        # HttpProxyMiddleware reads this meta key and routes the request through the proxy.
        request.meta['proxy'] = "http://" + pro_adr
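If the proxy requires username/password authentication, the address alone is not enough; that is what the base64 import above is typically used for. A minimal sketch, with hypothetical address and credentials (not part of the project above, replace them with your provider's values):

import base64

class AuthProxyMiddleware(object):
    # Hypothetical values -- substitute your own proxy address and credentials.
    proxy_addr = "61.129.70.131:8080"
    proxy_user_pass = "user:password"

    def process_request(self, request, spider):
        request.meta['proxy'] = "http://" + self.proxy_addr
        # Build the Basic auth header that an authenticated proxy expects.
        encoded = base64.b64encode(self.proxy_user_pass.encode('utf-8')).decode('utf-8')
        request.headers['Proxy-Authorization'] = 'Basic ' + encoded

Recent Scrapy versions also accept credentials embedded directly in the proxy URL, e.g. "http://user:password@61.129.70.131:8080", in which case base64 is not needed at all.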

settings.py

DOWNLOADER_MIDDLEWARES = {
    # 'ceshisc.middlewares.CeshiscDownloaderMiddleware': 543,
    'ceshisc.middlewares.ProxyMiddleware': 100,
    # HttpProxyMiddleware is enabled by default; listing it here makes the ordering explicit.
    'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware': 110,
}
ITEM_PIPELINES = {
    'ceshisc.pipelines.CeshiscPipeline': 300,
}
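The numbers are middleware priorities: lower values run earlier in process_request, so ProxyMiddleware (100) sets request.meta['proxy'] before HttpProxyMiddleware (110) picks it up. A common variation is to keep the proxy list in settings.py instead of hard-coding it in the middleware; a sketch, assuming a custom PROXY_LIST setting name that is not part of the original project:

# settings.py
PROXY_LIST = [
    "61.129.70.131",
    "120.204.85.29",
]

# middlewares.py
import random

class ProxyMiddleware(object):
    def __init__(self, proxy_list):
        self.proxy_list = proxy_list

    @classmethod
    def from_crawler(cls, crawler):
        # Pull the list out of the project settings when the crawler starts.
        return cls(crawler.settings.getlist('PROXY_LIST'))

    def process_request(self, request, spider):
        if self.proxy_list:
            request.meta['proxy'] = "http://" + random.choice(self.proxy_list)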

Spider code

import scrapy

class DmozSpider(scrapy.Spider):
    name = "demo"
    allowed_domains = ["baidu.com"]
    start_urls = ["http://www.baidu.com/"]
    

    def parse(self, response):
        print("进来了...........数据")

 

The above covers the main points of setting up an IP proxy pool in Scrapy. If it did not solve your problem, the following articles may help:

Running multiple Scrapy spiders in sequence, a code snippet (Python 3)

Setting up a User-Agent pool and a proxy pool in the Scrapy framework

Python 3 crawlers: using an IP proxy pool and random User-Agents with Scrapy

User-Agent pools and proxy IP pools