利用python爬取点小图片,满足私欲(爬虫)
Posted
tags:
篇首语:本文由小常识网(cha138.com)小编为大家整理,主要介绍了利用python爬取点小图片,满足私欲(爬虫)相关的知识,希望对你有一定的参考价值。
import requests
import re
import os,sys
links=[]
titles=[]
headers = {
"User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36"}
def get_url(page):
url=‘http://www.zbjuran.com/mei/xinggan/list_13_%s.html‘%(page)
data=requests.get(url,headers=headers).text
data_use=re.findall(‘<div class="name"><a target="_blank" href=".*?" title=".*?</a></div>‘,data)
for use in data_use:
link=‘http://www.zbjuran.com/‘+use.split(‘href="‘)[1].split(‘" title‘)[0]
links.append(link)
title=use.split(‘title="‘)[1].split(‘">‘)[0]
titles.append(title)
mkpath=‘/Users/b1ancheng/mzpc/%s‘%title
wtxtpath=‘/Users/b1ancheng/mzpc/%s/%s.txt‘ % (title, title)
def get_pic():
url_data=requests.get(link).text
page = int(url_data.split(‘<div class="page"><li><a>共‘)[1].split(‘页:‘)[0])
for i in range(1, page + 1):
print(‘正在下载第%s页‘%i)
pic_url = (link[:-5] + ‘_%s‘ + link[-5:])%i
print(pic_url)
try:
pic_data_link=‘http://www.zbjuran.com‘+requests.get(pic_url,headers=headers,timeout=5).text.split(‘<img src="‘)[1].split(‘" /></div>‘)[0]
with open(‘/Users/b1ancheng/mzpc/%s/%s_%s.JPG‘ % (title, title,i),‘wb‘) as pic_download:
pic_download.write(requests.get(pic_data_link).content)
except Exception as error:
print(error)
continue
# 创建目录
isExists = os.path.exists(mkpath)
if not isExists:
os.makedirs(mkpath)
get_pic()
else:
return False
if __name__ == ‘__main__‘:
for page in range(1,88):
get_url(page)
以上是关于利用python爬取点小图片,满足私欲(爬虫)的主要内容,如果未能解决你的问题,请参考以下文章
如何利用Python网络爬虫爬取微信朋友圈动态--附代码(下)