Scraping Autohome (汽车之家)
Posted by di2wu
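1. Getting to know requests and beautifulsoup4
soup.find: returns the first tag that matches the given name and attributes, or None if nothing matches
div.find_all(name='li'): returns every <li> tag found under div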
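To make the difference between find and find_all concrete, here is a minimal sketch that runs against an inline HTML string instead of the live site. The SAMPLE_HTML fragment and the example.com links are made up for illustration and are not the real Autohome markup; only the container id is taken from the script in this post.

```python
from bs4 import BeautifulSoup

# Hypothetical fragment standing in for the news list (not the real page markup).
SAMPLE_HTML = """
<div id="auto-channel-lazyload-article">
  <ul>
    <li><a href="//example.com/a.html"><h3>First title</h3><p>First summary</p></a></li>
    <li><a href="//example.com/b.html"><h3>Second title</h3><p>Second summary</p></a></li>
    <li class="ad"></li>
  </ul>
</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# find() returns the first matching Tag, or None when nothing matches.
container = soup.find(name="div", attrs={"id": "auto-channel-lazyload-article"})
missing = soup.find(name="table")

# find_all() returns a list-like ResultSet of every matching Tag (possibly empty).
items = container.find_all(name="li")

# Skip items without an <h3>, the same way the full script skips them.
titles = [li.find(name="h3").text for li in items if li.find(name="h3")]

print(len(items), titles, missing)
```

Because find can return None, it is worth checking the result before reading .text or calling .get on it; the empty third item above is the kind of entry that would otherwise raise AttributeError.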
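import requests
from bs4 import BeautifulSoup

# Fetch the news index page; the author decodes it explicitly as GBK.
response = requests.get("https://www.autohome.com.cn/news/")
response.encoding = 'gbk'
soup = BeautifulSoup(response.text, 'html.parser')

# The article list lives in the div with this id; each <li> is one article.
div = soup.find(name='div', attrs={'id': 'auto-channel-lazyload-article'})
li_list = div.find_all(name='li')
for li in li_list:
    title = li.find(name='h3')
    if not title:
        # Items without an <h3> are not articles, so skip them.
        continue
    p = li.find(name='p')
    a = li.find(name='a')
    print(title.text)
    print(a.attrs.get('href'))
    print(p.text)

    # The script assumes src is protocol-relative ("//..."), so it prepends the scheme.
    img = li.find(name='img')
    src = img.get('src')
    src = "https:" + src
    print(src)

    # Send a second request to download the image, named after the last path segment.
    file_name = src.rsplit('/', maxsplit=1)[1]
    ret = requests.get(src)
    with open(file_name, 'wb') as f:
        f.write(ret.content)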
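The script builds the image URL with "https:" + src, which only works when src is protocol-relative. As a hedged alternative, the sketch below uses the standard library's urljoin, which also handles absolute and root-relative values. The helper names absolute_url and file_name_from_url, and the img.example.com URLs, are hypothetical and not part of the original script.

```python
import posixpath
from urllib.parse import urljoin, urlsplit

# The page the links were scraped from; used as the base for relative URLs.
PAGE_URL = "https://www.autohome.com.cn/news/"


def absolute_url(src, base=PAGE_URL):
    """Resolve src against base; works for '//host/x', '/x', and full URLs."""
    return urljoin(base, src)


def file_name_from_url(url):
    """Return the last path segment, ignoring any query string."""
    return posixpath.basename(urlsplit(url).path)


# Hypothetical src values of the kinds an <img> tag might carry.
protocol_relative = absolute_url("//img.example.com/pics/car.jpg")
already_absolute = absolute_url("https://img.example.com/pics/car.jpg")
root_relative = absolute_url("/pics/car.jpg")

print(protocol_relative)
print(file_name_from_url(protocol_relative + "?w=120"))
```

Taking the file name from the parsed path rather than from rsplit('/') keeps a query string such as ?w=120 out of the name written to disk.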