Web Crawler Final Project
Posted mimimi
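Editor's note: this article was compiled by the editors of 小常识网 (cha138.com). It covers a web crawler final project: scraping a news article, extracting keywords, and drawing a word cloud.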
import jieba.analyse
import matplotlib.pyplot as plt
import numpy as np
import requests
from bs4 import BeautifulSoup
from PIL import Image
from wordcloud import WordCloud, ImageColorGenerator

# Fetch the article page and parse it.
url = "https://item.btime.com/36i90hfhkt3838be1gof3cla1ka?from=haozcxw"
res = requests.get(url)
res.encoding = 'utf-8'
soup = BeautifulSoup(res.text, 'html.parser')

title = soup.select('.title')[0].text
content = soup.select('.content-text')[0].text
info = soup.select('.edit-info')[0].text
# Slice the label off explicitly: lstrip() removes a set of characters, not a prefix.
label = '责任编辑:'
au = info[info.find(label):].split()[0][len(label):]
print(title, content, au)

# 'w' rather than 'a', so reruns do not append duplicate text.
with open('content.txt', 'w', encoding='utf-8') as f:
    f.write(content)

# Replace punctuation with spaces, accumulating the result on each pass.
strl = ',。、‘’ '
ls = content
for ch in strl:
    ls = ls.replace(ch, ' ')
print(ls)

# Read the whole file back; mixing iteration with read() skipped the first line.
with open('content.txt', 'r', encoding='utf-8') as f:
    lyric = f.read()

# TextRank returns (word, weight) pairs, which convert directly to a dict.
result = jieba.analyse.textrank(lyric, topK=50, withWeight=True)
keywords = dict(result)
print(keywords)

# Draw the word cloud inside the mask image and recolor it from that image.
image = Image.open('t01c9f26bac34842d0d.jpg')
graph = np.array(image)
wc = WordCloud(font_path='./fonts/simhei.ttf', background_color='white', max_words=50, mask=graph)
wc.generate_from_frequencies(keywords)
image_color = ImageColorGenerator(graph)
plt.imshow(wc.recolor(color_func=image_color))
plt.axis("off")
plt.show()
wc.to_file('d.jpg')
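Two text-handling steps in this script are easy to get subtly wrong, so here is a small standalone sketch of both. `str.lstrip` removes any leading characters that belong to the given set, not a literal prefix, so an editor name that starts with one of the label's characters gets truncated. Replacing punctuation in a loop only works if each pass builds on the previous result. The helper names `extract_editor` and `strip_punctuation` and the sample strings are made up for illustration; they are not part of the original script or of the scraped page.

```python
def extract_editor(info: str, label: str = "责任编辑:") -> str:
    """Return the first whitespace-delimited token after `label`, or "" if absent."""
    pos = info.find(label)
    if pos == -1:
        return ""
    tokens = info[pos + len(label):].split()
    return tokens[0] if tokens else ""


def strip_punctuation(text: str, marks: str = ",。、‘’") -> str:
    """Replace every character in `marks` with a space in a single pass."""
    return text.translate(str.maketrans({ch: " " for ch in marks}))


if __name__ == "__main__":
    # Hypothetical sample text, not taken from the real page.
    sample_info = "来源:示例 责任编辑:编辑甲 2018"
    print(extract_editor(sample_info))
    # lstrip() treats its argument as a character set and eats part of the name.
    print("责任编辑:编辑甲".lstrip("责任编辑:"))
    print(strip_punctuation("你好,世界。"))
```

Using `str.translate` with a mapping handles all the punctuation marks in one call, which avoids the trap of overwriting the result on each loop iteration.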