Requests 库
Posted Python & more
tags:
篇首语:本文由小常识网(cha138.com)小编为大家整理,主要介绍了Requests 库相关的知识,希望对你有一定的参考价值。
Requests 库的两个重要的对象:(Request , Response)
Response对象的属性:
import requests r =requests.get(\'http://www.bilibili.com\') response 对象 <Response [200]> print(r.status_code) # 200状态码-----404错误 print(r.headers) # 请求码 print(r.text) # 字符串形式 print(r.encoding) # 网页的编码方式-根据headers猜测 print(r.apparent_encoding) # 根据内容响应的编码方式(r.encoding=r.apparent_encoding) print(r.content) # 二进制形式
requests 库的7个重要的方法:
=============== requests 库的7个重要的方法 ==============
---1 requests.request(method,url,**kwargs)
---2 requests.get(url,params=None,**kwargs)
---3 requests.head(url,**kwargs)
---4 requests.post(url,data=None,json=None,**kwargs)
---5 requests.put(url,data=None,**kwargs)
---6 requests.patch(url,data=None,**kwargs)
---7 requests.delete(url,**kwargs)
Requests 请求的通用代码框架:
=== 通用框架 === import requests def gethtmlText(url): try: r=requests.get(url,timeout=30) # <Response [200]> r.raise_for_status() # 如果状态码不是200,引发HTTPError异常 r.encoding=r.apparent_encoding return r.text except: return \'Error!\' if __name__==\'__main__\': url=\'http://www.baidu.com\' print(getHTMLText(url))
伪装浏览器请求:
常用请求头: Mozilla/5.0 (Windows; U; MSIE 9.0; Windows NT 9.0; en-US)
#opera浏览器 Opera/9.80 (X11; Linux i686; U; ru) Presto/2.8.131 Version/11.11
#chrome浏览器 Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.2 (KHTML, like Gecko) Chrome/22.0.1216.0 Safari/537.2
#firefox浏览器 Mozilla/5.0 (Windows NT 6.2; Win64; x64; rv:16.0.1) Gecko/20121011 Firefox/16.0.1
#safri浏览器 Mozilla/5.0 (iPad; CPU OS 6_0 like Mac OS X) AppleWebKit/536.26 (KHTML, like Gecko) Version/6.0 Mobile/10A5355d Safari/8536.25
例子 1 跟换请求头
亚马逊通会过来源审查 阻止访问-----需要更改请求头 import requests url=\'https://www.amazon.cn/dp/B07473FZXY?_encoding=UTF8&ref_=pc_cxrd_658390051_recTab_335775071_t_3&pf_rd_p=7e00fee6-4e12-48f0-b4af-b99068b52067&pf_rd_s=merchandised-search-4&pf_rd_t=101&pf_rd_i=658390051&pf_rd_m=A1AJ19PSB66TGU&pf_rd_r=1SV7P98M8XX2E5N2PBW5&pf_rd_r=1SV7P98M8XX2E5N2PBW5&pf_rd_p=7e00fee6-4e12-48f0-b4af-b99068b52067\' try: header={\'User-Agent\':\'Mozilla/5.0\'} # \'Mozilla/5.0\'---浏览器身份标识 r=requests.get(url,headers=header) r.raise_for_status() r.encoding=r.apparent_encoding print(r.text[1000:2000]) except Exception: print(\'Error!\') print(r.request.url) #当前请求的网页 https://www.amazon.cn/%E5%9B%BE%E4%B9%A6/dp/B07473FZXY print(r.request.headers) # 当前的请求头 {\'User-Agent\': \'python-requests/2.18.2\', \'Accept-Encoding\': \'gzip, deflate\', \'Accept\': \'*/*\', \'Connection\': \'keep-alive\'}
例子 2 关键字搜索--百度 import requests keyword={\'wd\':\'Python\'} url=\'https://www.baidu.com/s\' try: r=requests.get(url,params=keyword) r.raise_for_status() r.encoding=r.apparent_encoding print(r.request.url) # print(len(r.text)) print(r.text) except Exception: print(\'Error!\')
例子 3 图片 爬取 与 保存
import requests import os
url=\'http://pic1.16pic.com/00/14/32/16pic_1432548_b.jpg\' root=r\'C:\\Users\\Administrator\\Desktop\' path=root+\'\\\\\'+url.split(\'/\')[-1]
try: if not os.path.exists(root): # 根目录是否存在 os.mkdir(root) elif not os.path.exists(path): # 文件是否存在 r=requests.get(url) # <Response [200]> # print(r.status_code) # print(r.content) # 二进制
with open(path,\'wb\') as f: f.write( r.content ) # 二进制写入 print(\'文件操作成功!\') else: print(\'文件已经存在!\') except: print(\'Error!\')
例子 5 ip 地址的归属地查询----ip138.com import requests url=\'http://www.ip138.com/ips138.asp?ip=\' ip=\'202.204.80.112\' url=url+ip try: r=requests.get(url) # print(r) # ===回应==== <Response [200]> # print(r.request.url) # ===请求==== http://www.ip138.com/ips138.asp?ip=202.204.80.112 r.raise_for_status() r.encoding=r.apparent_encoding print(r.text[-500:]) except: print(\'Error!\')
以上是关于Requests 库的主要内容,如果未能解决你的问题,请参考以下文章