Scraping the Maoyan Top 100 (only the first page is done)

Posted by skyda


# python 3.7
from urllib.request import Request,urlopen
import time,re,csv

class Maoyan(object):
    def __init__(self):
        # Request headers copied from a browser session. The cookie is
        # session-specific; replace it with your own if it has expired.
        self.header = {
            'Connection': 'keep-alive',
            'Cookie': 'uuid_n_v=v1; uuid=16B52300EED311E8A50EC9D5D894D382A1072CB6CA3D4BAA95D7EA39B1BB3637; _lxsdk_cuid=1673eb37e1fc8-011175d5446e19-424f0928-13c680-1673eb37e20c8; _lxsdk=16B52300EED311E8A50EC9D5D894D382A1072CB6CA3D4BAA95D7EA39B1BB3637; _csrf=6597fe121a59ff12f8bf1b793cb7d29274a118e066c86f8bf88b8e765b7d4dad; _lx_utm=utm_source%3DBaidu%26utm_medium%3Dorganic; __mta=145127947.1542945209936.1542945209936.1542954826219.2; _lxsdk_s=1673f4639ac-357-82a-15d%7C%7C4',
            'Host': 'maoyan.com',
            'Referer': 'http://maoyan.com/board',
            'Upgrade-Insecure-Requests': '1',
            'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36',
        }


    def get_page(self,url):
        res = urlopen(Request(url =url,headers=self.header)).read()
        self.parsePage(res.decode())

    def parsePage(self,res):
        # Capture the title, the "star" line and the release time of each entry.
        pattern = r'data-val="\{.*?\}">(.*?)</a></p>\s+<p class="star">\s+(.*?)\s+</p>\s+<p class="releasetime">(.*?)</p>'
        a = re.findall(pattern, res)
        self.write(a)

    def write(self, rows):
        # Open the file once and append one CSV row per matched entry.
        with open('11.csv', 'a+', newline='', encoding='gbk') as f:
            writer = csv.writer(f)
            for row in rows:
                writer.writerow(list(row))

    def wordon(self):
        pass

if __name__ == '__main__':
    a = Maoyan()
    a.get_page('http://maoyan.com/board/4?offset=0')
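The script above only requests `offset=0`, which is why it stops at the first page. Below is a minimal sketch of how the remaining pages might be covered. It assumes the board lists 10 films per page and that `offset` advances in steps of 10; this is inferred from the "Top 100" title and the `offset=0` URL, not confirmed by the original post. `BASE_URL` and `page_urls` are hypothetical names introduced for illustration.

```python
from urllib.parse import urlencode

# Board URL taken from the script above, without the query string.
BASE_URL = 'http://maoyan.com/board/4'


def page_urls(total=100, per_page=10):
    """Yield one URL per page, assuming offset grows by per_page each page."""
    for offset in range(0, total, per_page):
        query = urlencode({'offset': offset})
        yield BASE_URL + '?' + query


# Build the full list of page URLs (no network access happens here).
urls = list(page_urls())
```

Each URL could then be passed to `Maoyan.get_page` in a loop, ideally with a short `time.sleep` between requests so the site is not hit too quickly. This would also put the otherwise unused `time` import to work.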

 
