Python | Scraping the novel 《东宫》

Posted 2021-02-13 苏苏叶

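This post scrapes the recently popular novel 《东宫》 from 2k小说网 (fpzw.com). I adapted a piece of code I had seen earlier and used it for a simple crawl.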

from urllib import request
from bs4 import BeautifulSoup

# Index page that lists every chapter of the novel
url = 'https://www.fpzw.com/xiaoshuo/19/19210/'
req = request.Request(url)
response = request.urlopen(req)
html = response.read()
soup = BeautifulSoup(html, 'html.parser')
# Chapter links sit inside <dd> elements; the first four entries are skipped
soup_text = soup.find_all('dd')[4:]
f = open('Desktop/donggong.doc', 'w', encoding='utf-8')
for link in soup_text:
    # Each href is appended to the index page URL
    url2 = url + link.a.get('href')
    req2 = request.Request(url2)
    response2 = request.urlopen(req2)
    html2 = response2.read()
    soup2 = BeautifulSoup(html2, 'html.parser')
    # The chapter body is the <p> element with class "Text"
    soup_text2 = soup2.find('p', class_="Text").text
    # Remove the site's promotional phrases from the chapter text
    soup_text3 = soup_text2.replace('东宫最新章节', '')
    soup_text3 = soup_text3.replace('2k小说网欢迎您！本站域名:"2k小说"的完整拼音fpzw.com，很好记哦！www.fpzw.com 好看的小说', '')
    soup_text3 = soup_text3.replace('强烈推荐：', '')
    f.write(soup_text3)
    # Separator between chapters
    f.write(' ')
f.close()
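
The loop above builds each chapter URL by plain string concatenation, which only works if every `href` is a bare file name relative to the index page. A minimal sketch of a more tolerant alternative uses `urllib.parse.urljoin` from the standard library. The helper name `chapter_url` and the sample `href` values are hypothetical and were not taken from the site.

```python
from urllib.parse import urljoin

# Index page URL used by the script above
BASE_URL = 'https://www.fpzw.com/xiaoshuo/19/19210/'


def chapter_url(href, base=BASE_URL):
    """Resolve a chapter link against the index page URL.

    Handles bare file names, site-absolute paths, and full URLs.
    """
    return urljoin(base, href)


# Hypothetical href values, for illustration only
print(chapter_url('123.html'))
print(chapter_url('/xiaoshuo/19/19210/123.html'))
```

With this helper, the line that assigns `url2` could call `chapter_url(link.a.get('href'))` instead of concatenating strings.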

The scraped text has not been cleaned up carefully yet; that is left for later optimization.
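
One possible direction for that cleanup is sketched below. It removes the same promotional phrases as the script, then normalizes whitespace and drops empty lines. The function name `clean_chapter`, the whitespace rules, and the sample string are assumptions for illustration and are not part of the original code.

```python
import re

# Promotional phrases the original script already strips out
BOILERPLATE = (
    '东宫最新章节',
    '2k小说网欢迎您！本站域名:"2k小说"的完整拼音fpzw.com，很好记哦！www.fpzw.com 好看的小说',
    '强烈推荐：',
)


def clean_chapter(text, phrases=BOILERPLATE):
    """Strip known boilerplate, collapse whitespace, and drop empty lines."""
    for phrase in phrases:
        text = text.replace(phrase, '')
    # Turn non-breaking and full-width spaces into ordinary spaces
    text = text.replace('\xa0', ' ').replace('\u3000', ' ')
    # Collapse runs of spaces/tabs within each line and trim the ends
    lines = [re.sub(r'[ \t]+', ' ', line).strip() for line in text.splitlines()]
    # Keep only non-empty lines, one per line
    return '\n'.join(line for line in lines if line)


# Made-up sample text, for illustration only
sample = '东宫最新章节\n\u3000\u3000第一章\xa0\xa0开始\n\n\n强烈推荐：\n正文  内容'
print(clean_chapter(sample))
```

In the scraping loop, the three chained `replace` calls could then be swapped for a single `clean_chapter(soup_text2)` call, with a newline written between chapters.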
