Learning Notes: requests + BeautifulSoup

Posted by 7749ha


Step 1: requests

GET request

# -*- coding:utf-8 -*-
# Date: 2018/5/15 17:46
# Author: 小鼠标
import requests

url = "http://www.baidu.com"
# res = requests.get(url)           # method 1: convenience function
res = requests.request('get', url)  # method 2: generic request()
print('Status code:', res.status_code)
print('Response body:', res.text)

POST request

# -*- coding:utf-8 -*-
# Date: 2018/5/15 17:46
# Author: 小鼠标
import requests

url = "http://www.baidu.com"
data = {
    'username': 'xiaoshubiao',
    'pwd': 'xiaoshubiao'
}
res = requests.post(url, data=data)  # form data is sent in the request body
print('Status code:', res.status_code)
print('Response body:', res.text)

Step 2: Faking a browser User-Agent and forging cookies

# -*- coding:utf-8 -*-
# Date: 2018/5/15 17:46
# Author: 小鼠标
import requests

url = "http://www.baidu.com"
# Copy a real browser's request headers so the server treats us as a browser.
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 '
                  '(KHTML, like Gecko) Chrome/55.0.2883.87 UBrowser/6.2.3964.2 '
                  'Safari/537.36',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,'
              'image/webp,*/*;q=0.8',
    'Accept-Encoding': 'gzip, deflate, sdch',
    'Accept-Language': 'zh-CN,zh;q=0.8,en;q=0.6',
    'Cache-Control': 'max-age=0',
    'Connection': 'keep-alive'
}
cookies = dict(name='xiaoshubiao')
res = requests.get(url, headers=headers, cookies=cookies)
print('Status code:', res.status_code)
print('Response body:', res.text)
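When the same headers and cookies need to persist across many requests, a `requests.Session` can hold them once instead of passing them to every call. This is an addition to the original notes, sketched here without a network request so it runs standalone:

```python
import requests

# A Session keeps headers and cookies for every request made through it.
session = requests.Session()
session.headers.update({'User-Agent': 'Mozilla/5.0'})
session.cookies.set('name', 'xiaoshubiao')

# Any session.get(...)/session.post(...) call now carries these automatically.
print(session.headers['User-Agent'])  # Mozilla/5.0
print(session.cookies.get('name'))    # xiaoshubiao
```

A Session also reuses the underlying TCP connection, which is noticeably faster when crawling many pages from one host.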

Step 3: Using a proxy IP

# -*- coding:utf-8 -*-
# Date: 2018/5/15 17:46
# Author: 小鼠标
import requests

url = "http://www.baidu.com"
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 '
                  '(KHTML, like Gecko) Chrome/55.0.2883.87 UBrowser/6.2.3964.2 '
                  'Safari/537.36',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,'
              'image/webp,*/*;q=0.8',
    'Accept-Encoding': 'gzip, deflate, sdch',
    'Accept-Language': 'zh-CN,zh;q=0.8,en;q=0.6',
    'Cache-Control': 'max-age=0',
    'Connection': 'keep-alive'
}
cookies = dict(name='xiaoshubiao')
# Map each URL scheme to a proxy; plain HTTP traffic goes through this address.
proxies = {'http': 'http://218.73.134.234:36602'}
res = requests.get(url, headers=headers, cookies=cookies, proxies=proxies)
print('Status code:', res.status_code)
print('Response body:', res.text)
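The title promises BeautifulSoup, but the notes above stop at fetching pages. A minimal sketch of the parsing half, run on a literal HTML string so it works offline; in practice the input would be `res.text` from one of the requests above, and the tag names here are illustrative:

```python
from bs4 import BeautifulSoup

# Stand-in for res.text; a literal string keeps the example network-free.
html = ('<html><head><title>示例页面</title></head>'
        '<body><a href="http://www.baidu.com">百度</a></body></html>')

# html.parser is the built-in parser; lxml is a faster optional alternative.
soup = BeautifulSoup(html, 'html.parser')

print(soup.title.string)  # the <title> text: 示例页面
print(soup.a['href'])     # attribute access: http://www.baidu.com
print(soup.a.get_text())  # tag text content: 百度
```

`soup.find_all('a')` returns every matching tag as a list when a page has more than one link.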
