python3 requests module

Posted by lilyxiaoyy


# coding:gbk
import requests

response = requests.get('http://www.sina.com.cn/')
print(response)
print(response.status_code)  # 200 means OK, 404 means page not found, 503 and other 5xx codes are errors inside the remote site
print(response.content)      # raw response body as bytes
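The comment on `status_code` above groups codes by their first digit. As a minimal sketch of that grouping, the helper below turns a numeric code into a readable label. The function name `describe_status` is made up for illustration and is not part of requests; it only uses `http.HTTPStatus` from the standard library, so it runs without any network access.

```python
from http import HTTPStatus


def describe_status(code):
    """Return a label such as '404 Not Found (client error)' for a status code.

    Hypothetical helper for illustration only; requests does not provide it.
    """
    if 100 <= code < 200:
        category = "informational"
    elif 200 <= code < 300:
        category = "success"
    elif 300 <= code < 400:
        category = "redirect"
    elif 400 <= code < 500:
        category = "client error"
    elif 500 <= code < 600:
        category = "server error"
    else:
        category = "unknown"

    # HTTPStatus raises ValueError for codes it does not know about
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        phrase = "Unknown"

    return "{} {} ({})".format(code, phrase, category)


if __name__ == "__main__":
    for code in (200, 404, 503):
        print(describe_status(code))
```

In a real script you would pass `response.status_code` to such a helper before deciding whether to read `response.content`.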

 

Crawler example

import re
import requests
from multiprocessing import Pool

def get_page(url, pattern):
    # Runs in a worker process; returns (html, pattern) on success, otherwise None
    response = requests.get(url)
    if response.status_code == 200:
        return (response.text, pattern)

def parse_page(info):
    # Callback: receives the return value of get_page
    if info is None:  # the request did not return 200, nothing to parse
        return
    page_content, pattern = info
    res = re.findall(pattern, page_content)
    for item in res:
        dic = {
            'index': item[0],
            'title': item[1],
            'actor': item[2].strip()[3:],  # skip the first 3 characters
            'time': item[3][5:],           # skip the first 5 characters
            'score': item[4] + item[5]     # integer part + fraction part
        }
        print(dic)

if __name__ == '__main__':
    pattern1 = re.compile(r'<dd>.*?board-index.*?>(\d+)<.*?title="(.*?)".*?star.*?>(.*?)<.*?releasetime.*?>(.*?)'
                          r'<.*?integer.*?>(.*?)<.*?fraction.*?>(.*?)<', re.S)

    url_dic = {
        'http://maoyan.com/board/7': pattern1,
    }

    p = Pool()
    for url, pattern in url_dic.items():
        res = p.apply_async(get_page, args=(url, pattern), callback=parse_page)

    p.close()
    p.join()


# {'index': '1', 'time': '2019-05-16', 'title': '海蒂和爷爷', 'actor': '阿努克·斯特芬,布鲁诺·甘茨,昆林·艾格匹', 'score': '9.5'}
# {'index': '2', 'time': '2019-05-31', 'title': '尺八·一声一世', 'actor': '佐藤康夫,小凑昭尚,蔡鸿文', 'score': '9.4'}
# {'index': '3', 'time': '2019-06-05', 'title': '无所不能', 'actor': '赫里尼克·罗斯汉,亚米·高塔姆,洛尼特·罗伊', 'score': '9.3'}
# {'index': '4', 'time': '2019-04-29', 'title': '何以为家', 'actor': '赞恩·阿尔·拉菲亚,约丹诺斯·希费罗,博鲁瓦蒂夫·特雷杰·班科尔', 'score': '9.3'}
# {'index': '5', 'time': '2019-05-17', 'title': '一条狗的使命2', 'actor': '丹尼斯·奎德,凯瑟琳·普雷斯科特,刘宪华', 'score': '9.2'}
# {'index': '6', 'time': '2019-05-10', 'title': '一个母亲的复仇', 'actor': '希里黛玉,阿克夏耶·坎纳,萨佳·阿里', 'score': '9.2'}
# {'index': '7', 'time': '2019-05-24', 'title': '龙珠超:布罗利', 'actor': '野泽雅子,堀川亮,中尾隆圣', 'score': '9.2'}
# {'index': '8', 'time': '2019-05-01', 'title': '港珠澳大桥', 'actor': '2288;', 'score': '9.2'}
# {'index': '9', 'time': '2019-05-17', 'title': '音乐家', 'actor': '胡军,袁泉,别里克·艾特占诺夫', 'score': '9.1'}
# {'index': '10', 'time': '2019-05-24', 'title': '阿拉丁', 'actor': '梅纳·玛索德,娜奥米·斯科特,马尔万·肯扎里', 'score': '9.0'}
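To see how the regular expression and the slicing in `parse_page` work without hitting the network, here is a rough offline sketch. The HTML in `SAMPLE_HTML` is made up: it only imitates the structure the pattern expects (`board-index`, `title`, `star`, `releasetime`, `integer`, `fraction`), and the label prefixes "主演：" (3 characters) and "上映时间：" (5 characters) are an assumption inferred from the `[3:]` and `[5:]` slices. The real page may differ. The function name `parse_board` is also just for illustration.

```python
import re

# Same pattern as in the crawler example above
PATTERN = re.compile(
    r'<dd>.*?board-index.*?>(\d+)<.*?title="(.*?)".*?star.*?>(.*?)<.*?releasetime.*?>(.*?)'
    r'<.*?integer.*?>(.*?)<.*?fraction.*?>(.*?)<', re.S)

# Hypothetical markup, written by hand to match the pattern (not the real page)
SAMPLE_HTML = """
<dd>
  <i class="board-index board-index-1">1</i>
  <a href="/films/1" title="Example Film" class="image-link"></a>
  <p class="star">
    主演：Actor A,Actor B
  </p>
  <p class="releasetime">上映时间：2019-05-16</p>
  <p class="score"><i class="integer">9.</i><i class="fraction">5</i></p>
</dd>
<dd>
  <i class="board-index board-index-2">2</i>
  <a href="/films/2" title="Another Film" class="image-link"></a>
  <p class="star">
    主演：Actor C
  </p>
  <p class="releasetime">上映时间：2019-05-31</p>
  <p class="score"><i class="integer">9.</i><i class="fraction">4</i></p>
</dd>
"""


def parse_board(html):
    """Return one dict per <dd> entry matched by PATTERN."""
    movies = []
    for item in PATTERN.findall(html):
        movies.append({
            'index': item[0],
            'title': item[1],
            'actor': item[2].strip()[3:],  # drop the assumed 3-character label
            'time': item[3][5:],           # drop the assumed 5-character label
            'score': item[4] + item[5],    # "9." + "5" gives "9.5"
        })
    return movies


if __name__ == '__main__':
    for movie in parse_board(SAMPLE_HTML):
        print(movie)
```

Because the slices cut a fixed number of characters, an entry whose field does not start with the expected label gets truncated in the wrong place, which is likely what produced the odd `'2288;'` value in the output above.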

 
