A small Python crawler program

Posted 2020-10-21

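A small crawler program: it downloads a main page and saves it to a file, then extracts the links found in that page and downloads each linked page to its own file.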

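# coding: utf-8
# Written for Python 3 (print() function, str/bytes kept separate).
import re
import requests

url = 'http://ai.51cto.com/'
con = requests.get(url)

# Save the raw bytes of the main page.
with open(r'D:\Python27\sevenot_test\curbug3\test.txt', 'wb') as f:
    f.write(con.content)

# Collect every absolute link (href starting with "http") in the main page.
# con.text is the decoded HTML, so a str pattern can be matched against it.
href = re.findall(r'<a href="(http.*?)"', con.text, re.S)

# Download each linked page and save it as test0.txt, test1.txt, ...
for a, i in enumerate(href):
    print(str(a) + ' ' + i)
    cc = requests.get(i)
    with open(r'D:\Python27\sevenot_test\curbug3\test' + str(a) + '.txt', 'wb') as f:
        f.write(cc.content)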
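
The regular expression in the program only matches links where href is the first attribute of the <a> tag and the URL is absolute, so relative links and tags such as <a class="x" href="..."> are skipped. The following is a minimal sketch of a more tolerant extractor, assuming Python 3 and using only the standard library (html.parser and urllib.parse.urljoin). The names LinkCollector and extract_links are hypothetical and do not appear in the original program.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag (hypothetical helper)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != 'a':
            return
        for name, value in attrs:
            if name == 'href' and value:
                # Resolve relative links against the page URL.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return the http(s) links in html, in page order, without duplicates."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    seen = set()
    result = []
    for link in parser.links:
        # Drop non-web schemes such as javascript: or mailto:.
        if link.startswith(('http://', 'https://')) and link not in seen:
            seen.add(link)
            result.append(link)
    return result
```

With this in place, href = extract_links(con.text, url) could stand in for the re.findall line. Dropping duplicates also avoids downloading the same page more than once.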
