Use a random user agent for each request

Posted 2021-02-24


# Use this downloader middleware to send a random user agent with every request the spider makes.
# Define USER_AGENT_LIST in your settings; the middleware chooses a random user agent from that list for each request.
#
# You have to disable the default user agent middleware and add this one in your settings file.
# (Scrapy releases before 1.0 used the path 'scrapy.contrib.downloadermiddleware.useragent.UserAgentMiddleware'.)
#
#     DOWNLOADER_MIDDLEWARES = {
#         'scraper.random_user_agent.RandomUserAgentMiddleware': 400,
#         'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware': None,
#     }
 
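import logging
import random

from scraper.settings import USER_AGENT_LIST

# scrapy.log no longer exists in current Scrapy; the standard logging module replaces it.
logger = logging.getLogger(__name__)


class RandomUserAgentMiddleware:
    """Downloader middleware that sets a random User-Agent on each request."""

    def process_request(self, request, spider):
        # random.choice raises IndexError on an empty list, so guard first.
        ua = random.choice(USER_AGENT_LIST) if USER_AGENT_LIST else None
        if ua:
            # setdefault keeps a User-Agent that the request already carries.
            request.headers.setdefault('User-Agent', ua)
        # logger.debug('>>>> UA %s', request.headers)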
 
# Snippet imported from snippets.scrapy.org (which no longer works)
# author: dushyant
# date  : Sep 16, 2011
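
The snippet imports USER_AGENT_LIST straight from the project's settings module. As a rough sketch of an alternative, the list could instead be read from the crawler settings through `from_crawler`, which avoids hard-coding the `scraper.settings` import. The class name `SettingsRandomUserAgentMiddleware` and the user agent strings below are placeholders made up for illustration and are not part of the original snippet; the list itself would normally live in `settings.py`.

```python
import random

# Placeholder values for illustration only; in a real project this list
# would be defined in settings.py under the name USER_AGENT_LIST.
USER_AGENT_LIST = [
    "ExampleBrowser/1.0 (X11; Linux x86_64)",
    "ExampleBrowser/2.0 (Windows NT 10.0; Win64; x64)",
    "ExampleBrowser/3.0 (Macintosh; Intel Mac OS X 10_15_7)",
]


class SettingsRandomUserAgentMiddleware:
    """Hypothetical variant: the user agent list comes from the crawler settings."""

    def __init__(self, user_agents):
        self.user_agents = list(user_agents)

    @classmethod
    def from_crawler(cls, crawler):
        # Scrapy builds the middleware through from_crawler when it is defined.
        # Settings.getlist returns an empty list if USER_AGENT_LIST is not set.
        return cls(crawler.settings.getlist("USER_AGENT_LIST"))

    def process_request(self, request, spider):
        if self.user_agents:
            # setdefault leaves an explicitly set User-Agent untouched.
            request.headers.setdefault("User-Agent", random.choice(self.user_agents))
        # Returning None tells Scrapy to continue processing the request.
        return None
```

With this variant the `DOWNLOADER_MIDDLEWARES` entry would point at wherever the class is saved, and an empty or missing USER_AGENT_LIST simply leaves requests unchanged.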
