Python Web Scraping, Lesson 2
Posted by helenandyoyo
Learning Python scraping: the requests module
Response content
# Response content
import requests

response = requests.get('http://www.baidu.com')
print(type(response))               # <class 'requests.models.Response'>
print(response.status_code)
print(type(response.text))          # <class 'str'>
print(response.text)                # body decoded to a string
print(response.cookies)
print(response.content)             # raw response body as bytes
print(response.content.decode('utf-8'))
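The key distinction above is that response.content holds raw bytes while response.text is those bytes decoded to a string. A minimal offline illustration, with a byte string standing in for an actual response body:

```python
# Sample bytes standing in for response.content (no network needed).
raw = "百度一下".encode('utf-8')
print(type(raw))              # <class 'bytes'>

# Decoding is what response.content.decode('utf-8') does,
# yielding a str like response.text.
text = raw.decode('utf-8')
print(type(text))             # <class 'str'>
print(text)
```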
GET request with parameters
# GET request with parameters (despite the section's original title of
# "POST request", this code sends a GET; params appends a query string)
import requests

data = {
    "name": "sun",
    "age": "25"
}
response = requests.get('http://www.baidu.com', params=data)
print(response.url)
print(response.content.decode('utf-8'))
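The code above sends a GET, where params becomes a query string; an actual POST carries the data in the request body instead. Both encodings can be inspected offline with a prepared request (the httpbin URLs are just examples; nothing is sent):

```python
import requests

data = {"name": "sun", "age": "25"}

# params -> query string appended to the URL
get_req = requests.Request('GET', 'http://httpbin.org/get', params=data).prepare()
print(get_req.url)

# data -> form-encoded request body
post_req = requests.Request('POST', 'http://httpbin.org/post', data=data).prepare()
print(post_req.body)
```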
Status codes
# Status codes
import requests

response = requests.get('http://www.baidu.com')
if response.status_code == requests.codes.ok:
    print('Request succeeded')
File upload
# File upload
import requests

files = {"files": open("test.jpg", "rb")}
response = requests.post('http://httpbin.org/post', files=files)
print(response.text)
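Passing files makes requests switch to multipart encoding. This can be verified offline by preparing (but not sending) the request, with an in-memory buffer standing in for test.jpg:

```python
import io
import requests

# In-memory stand-in for open("test.jpg", "rb"); no real file needed.
files = {"files": ("test.jpg", io.BytesIO(b"fake image bytes"))}
prepared = requests.Request('POST', 'http://httpbin.org/post', files=files).prepare()

# requests sets a multipart Content-Type with a generated boundary.
print(prepared.headers['Content-Type'])
```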
Getting cookies
# Getting cookies
import requests

response = requests.get('http://www.baidu.com')
print(response.cookies)
for key, value in response.cookies.items():
    print(key + "=" + value)
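response.cookies is a RequestsCookieJar, which supports dict-style iteration. The same items() loop can be exercised offline on a standalone jar:

```python
import requests

# Build a cookie jar by hand (sample name/value) instead of from a response.
jar = requests.cookies.RequestsCookieJar()
jar.set('name', 'sun')

for key, value in jar.items():
    print(key + "=" + value)
```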
Session persistence
# Session persistence: a Session object keeps cookies across requests
import requests

s = requests.Session()
s.get('http://httpbin.org/cookies/set/number/123456')
response = s.get('http://httpbin.org/cookies')
print(response.text)
Parsing JSON
# Parsing JSON
import requests
import json

response = requests.get('http://httpbin.org/get')
print(type(response.text))          # <class 'str'>
print(response.json())
print(json.loads(response.text))    # same result as response.json()
print(type(response.json()))        # <class 'dict'>
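response.json() is essentially json.loads applied to the body text. The parsing step can be shown offline with a sample httpbin-style payload standing in for response.text:

```python
import json

# Sample body text in the shape httpbin.org/get returns.
body = '{"url": "http://httpbin.org/get", "origin": "1.2.3.4"}'

parsed = json.loads(body)     # what response.json() returns
print(type(parsed))           # <class 'dict'>
print(parsed["url"])
```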
Adding headers
# Adding headers
import requests

headers = {
    "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36"
}
response = requests.get('https://www.zhihu.com', headers=headers)
print(response.text)
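That the headers dict actually ends up on the outgoing request can be confirmed offline by preparing the request and reading the header back (placeholder User-Agent string; nothing is sent):

```python
import requests

headers = {"User-Agent": "my-crawler/0.1"}   # placeholder UA for the demo
prepared = requests.Request('GET', 'https://www.zhihu.com', headers=headers).prepare()
print(prepared.headers['User-Agent'])
```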
Authentication, method one
# Basic auth via HTTPBasicAuth
import requests
from requests.auth import HTTPBasicAuth

response = requests.get('http://120.27.34.24:9001/', auth=HTTPBasicAuth("user", "123"))
print(response.status_code)
Authentication, method two
# Basic auth via the (user, password) tuple shorthand
import requests

response = requests.get('http://120.27.34.24:9001/', auth=("user", "123"))
print(response.status_code)
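The two styles are equivalent: a plain (user, password) tuple is treated as HTTPBasicAuth, and both produce the same Basic Authorization header. A sketch verifying this offline on prepared requests (example URL and credentials):

```python
import requests
from requests.auth import HTTPBasicAuth

explicit = requests.Request('GET', 'http://example.com/',
                            auth=HTTPBasicAuth('user', '123')).prepare()
shorthand = requests.Request('GET', 'http://example.com/',
                             auth=('user', '123')).prepare()

# Both carry an identical "Basic <base64(user:123)>" header.
print(explicit.headers['Authorization'])
print(shorthand.headers['Authorization'])
```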
Exception handling
# Exception handling
import requests
from requests.exceptions import Timeout, ConnectionError, RequestException

try:
    response = requests.get('http://httpbin.org/get', timeout=0.1)
    print(response.status_code)
except Timeout:
    print("timeout!")
except ConnectionError:
    print('connection error!')
except RequestException:
    print('error')
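The except clauses above are deliberately ordered from specific to general: Timeout and ConnectionError both derive from RequestException, so a RequestException clause listed first would catch everything and the specific handlers would never run. The hierarchy can be checked directly:

```python
from requests.exceptions import Timeout, ConnectionError, RequestException

# Both specific exceptions are subclasses of the catch-all RequestException.
print(issubclass(Timeout, RequestException))          # True
print(issubclass(ConnectionError, RequestException))  # True
```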
Note: source material: https://www.cnblogs.com/zhaof/p/6915127.html