Scraping Meizitu Images

Posted 2020-10-01 月下柳梢映

This is a simple crawler I wrote a while back to scrape images from a Meizitu site. The source code is below:

# -*- coding: utf-8 -*-

import os
import urllib.request

from lxml import etree

# os.chdir("meizhu")  # switch the working directory
print(os.getcwd())  # show the current working directory

headers = {"User-Agent": "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)"}

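def use_proxy():
    # 'proxy_addr' is a placeholder; replace it with a real proxy address (host:port)
    proxy = urllib.request.ProxyHandler({'http': 'proxy_addr'})
    opener = urllib.request.build_opener(proxy, urllib.request.HTTPHandler)
    # Install the opener so that later urlopen() calls go through the proxy
    urllib.request.install_opener(opener)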

def respon(imgurl):
    # Fetch the listing page and decode it as UTF-8
    req = urllib.request.Request(imgurl, headers=headers)
    html = urllib.request.urlopen(req)
    response = html.read().decode('utf-8')
    # print(response)
    selector = etree.HTML(response)

    # Collect the src attribute of every image in the "pic" list
    imgs = selector.xpath('//div[@class="pic"]/ul/li/a/img/@src')

    for imgname in imgs:
        # Name the file after the sixth '/'-separated URL segment, minus its extension
        imgnames = imgname.split('/')[5].split('.')[0] + ".jpg"
        # print(imgnames)

        urllib.request.urlretrieve(imgname, filename=imgnames)
    print("Finished scraping the images! Hahaha")

if __name__ == "__main__":
    # Walk listing pages 1 to 99
    for i in range(1, 100):
        imgurl = 'http://www.mmjpg.com/home/' + str(i)
        respon(imgurl)
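
The file name in respon comes from the sixth '/'-separated segment of each image URL, so it only works while the URLs keep exactly that shape. Below is a minimal sketch of a more tolerant variant, assuming only that the last path segment is the image file. The helper name filename_from_url and the example URLs are hypothetical and not part of the original script. Unlike the original, which always appends ".jpg", this sketch keeps the extension found in the URL.

```python
import os
from urllib.parse import urlparse


def filename_from_url(img_url, default_ext=".jpg"):
    """Return a local file name taken from the last path segment of img_url.

    Hypothetical helper for illustration; not part of the original script.
    """
    path = urlparse(img_url).path        # drops scheme, host, and query string
    base = os.path.basename(path)        # last path segment, e.g. "1234.jpg"
    stem, ext = os.path.splitext(base)   # e.g. ("1234", ".jpg")
    if not stem:
        raise ValueError("no file name in URL: " + img_url)
    # Fall back to default_ext when the URL carries no extension
    return stem + (ext or default_ext)


if __name__ == "__main__":
    # Made-up URL, used only to show the call
    print(filename_from_url("http://img.example.com/small/2017/1234.jpg"))
```

Inside the loop in respon, the result could be passed as the filename argument to urllib.request.urlretrieve in place of imgnames.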
