Using FormRequest.from_response() to simulate a user login
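Websites commonly pre-populate certain form fields through <input type="hidden"> elements, for example session-related data or the authentication token on a login page.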
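When scraping pages with Scrapy, you usually want those fields to keep their pre-populated values and to fill in or override only a few of them, such as the username and password. The FormRequest.from_response() method does exactly that.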
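To make the idea of pre-population concrete, here is a minimal standard-library sketch of the underlying principle: collect the values already present in a form, then override only the fields you care about. It is not Scrapy's implementation. The helper names (FormFieldCollector, prefill_form), the sample HTML, and the csrf_token field are all assumptions made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode


class FormFieldCollector(HTMLParser):
    """Hypothetical helper: collect name/value pairs from <input> tags."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        name = attr.get("name")
        if name:
            # A missing value attribute is treated as an empty string.
            self.fields[name] = attr.get("value") or ""


def prefill_form(html, overrides):
    """Return the form's existing field values with overrides applied on top."""
    collector = FormFieldCollector()
    collector.feed(html)
    merged = dict(collector.fields)
    merged.update(overrides)
    return merged


# Made-up login page; the hidden csrf_token field is an assumption.
LOGIN_PAGE = """
<form action="/users/login.php" method="post">
  <input type="hidden" name="csrf_token" value="abc123">
  <input type="text" name="username" value="">
  <input type="password" name="password">
</form>
"""

form_data = prefill_form(LOGIN_PAGE, {"username": "john", "password": "secret"})
encoded_body = urlencode(form_data)
```

The hidden token is carried over unchanged while the username and password are replaced, which is the behaviour you rely on when passing only a few fields in formdata.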
Here is an example spider that uses FormRequest.from_response():
import scrapy


class LoginSpider(scrapy.Spider):
    name = 'example.com'
    start_urls = ['http://www.example.com/users/login.php']

    def parse(self, response):
        # Build the login request from the form found in the response;
        # fields not listed in formdata keep their pre-populated values.
        return scrapy.FormRequest.from_response(
            response,
            formdata={'username': 'john', 'password': 'secret'},
            callback=self.after_login
        )

    def after_login(self, response):
        # check login succeed before going on
        # (response.body is bytes, so compare against a bytes literal)
        if b"authentication failed" in response.body:
            self.logger.error("Login failed")
            return

        # continue scraping with authenticated s