Web Scraper: Scraping the Douban Books TOP250

Posted hiss


import requests
from bs4 import BeautifulSoup


def get_book(url):
    # Fetch one book's detail page and print its title, score, and first listed author.
    wb_data = requests.get(url)
    soup = BeautifulSoup(wb_data.text, "lxml")
    title_list = soup.select("h1 > span")
    title = title_list[0].text
    author_list = soup.select("div#info > a")
    # Author names on the page contain spaces and line breaks; strip them out.
    author = author_list[0].text.replace(" ", "").replace("\n", "")
    score_list = soup.select("strong.ll.rating_num")
    score = score_list[0].text.strip()

    data = {
        "title": title,
        "score": score,
        "author": author,
    }

    print(data)


def get_all_book():
    # The list shows 25 books per page, so start takes the values 0, 25, ..., 225.
    for i in range(0, 250, 25):
        url = "https://book.douban.com/top250?start=" + str(i)
        wb_data = requests.get(url)
        soup = BeautifulSoup(wb_data.text, "lxml")
        # Each entry on the list page links to the book's detail page.
        href_list = soup.select("div.pl2 > a")
        for href in href_list:
            link = href.get("href")
            get_book(link)


get_all_book()
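
The paging done by the range(0, 250, 25) loop in get_all_book can be checked without touching the network. The sketch below only builds the ten list-page URLs; the helper name page_urls and its total and page_size parameters are hypothetical and not part of the original script.

```python
from urllib.parse import urlencode

BASE_URL = "https://book.douban.com/top250"


def page_urls(total=250, page_size=25):
    """Build one list-page URL per block of page_size books (hypothetical helper)."""
    return [
        f"{BASE_URL}?{urlencode({'start': start})}"
        for start in range(0, total, page_size)
    ]


if __name__ == "__main__":
    # Print the URLs that the scraper would visit, in order.
    for url in page_urls():
        print(url)
```

Separating URL construction from fetching makes it easy to confirm that the offsets cover all 250 entries before any request is sent.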

 
