Using urllib: timeouts, HTTP response codes, redirects, and proxy settings


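1. Setting a timeout

One option is to set a global default timeout through the socket module. urlopen also accepts a timeout argument that applies to a single call.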

#!/usr/bin/env python3
import socket
import urllib.request

# Timeout in seconds
timeout = 2
socket.setdefaulttimeout(timeout)

# This call to urllib.request.urlopen now uses the default timeout
# we have set in the socket module
req = urllib.request.Request("http://www.python.org/")
a = urllib.request.urlopen(req).read()
print(a)
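
The timeout can also be limited to one call by passing timeout to urlopen, which leaves the global socket default untouched. The following is a minimal sketch, not a definitive implementation. SlowHandler, start_slow_server and fetch_with_timeout are hypothetical names introduced here, and the slow local server only exists so that the example runs without network access.

```python
import socket
import threading
import time
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class SlowHandler(BaseHTTPRequestHandler):
    """Hypothetical local handler that waits before answering."""

    def do_GET(self):
        time.sleep(0.5)  # simulate a slow server
        body = b"late reply"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        try:
            self.wfile.write(body)
        except OSError:
            pass  # the client may already have given up

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def start_slow_server():
    """Start the slow server on a free local port and return its URL."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), SlowHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return "http://127.0.0.1:%d/" % server.server_port


def fetch_with_timeout(url, timeout):
    """Return the response body, or None if the call timed out."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.read()
    except socket.timeout:
        # Raised directly when reading the response takes too long
        return None
    except urllib.error.URLError as e:
        # A timeout while connecting is wrapped in URLError
        if isinstance(e.reason, socket.timeout):
            return None
        raise
```

Calling fetch_with_timeout(start_slow_server(), 0.1) gives up before the server answers, while a timeout of a few seconds lets the same request finish.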

2. Getting the HTTP response code

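#!/usr/bin/env python3
import urllib.error
import urllib.request

req = urllib.request.Request("http://python.org/")
try:
    with urllib.request.urlopen(req) as response:
        # On success the status code is available on the response
        print(response.status)
except urllib.error.HTTPError as e:
    # On a 4xx/5xx response the code is carried by the exception
    print(e.code)
    print(e.read().decode("utf8"))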
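
The status code lives in two places: on the response object when the request succeeds, and on the HTTPError when the server answers with an error status. A small helper can return it in both cases. This is only a sketch: DemoHandler, start_demo_server and get_status are hypothetical names, and the local server is an assumption made so the example can be tried offline.

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class DemoHandler(BaseHTTPRequestHandler):
    """Hypothetical local handler: 200 for "/", 404 for anything else."""

    def do_GET(self):
        if self.path == "/":
            body = b"hello"
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def start_demo_server():
    """Start the demo server on a free local port and return its base URL."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), DemoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return "http://127.0.0.1:%d" % server.server_port


def get_status(url):
    """Return the HTTP status code whether or not the request succeeded."""
    try:
        with urllib.request.urlopen(url, timeout=5) as response:
            return response.status
    except urllib.error.HTTPError as e:
        return e.code
```

With base = start_demo_server(), get_status(base + "/") returns the success code and get_status(base + "/missing") returns the error code sent by the server.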

3. Exception handling, approach 1

#!/usr/bin/env python3
from urllib.request import Request, urlopen
from urllib.error import URLError, HTTPError

req = Request("http://www.python.org/")
try:
    response = urlopen(req)
except HTTPError as e:
    # HTTPError is a subclass of URLError, so it must be caught first
    print("The (www.python.org) server couldn't fulfill the request.")
    print("Error code: ", e.code)
except URLError as e:
    print("We failed to reach a server.")
    print("Reason: ", e.reason)
else:
    print("good!")
    print(response.read().decode("utf8"))

4. Exception handling, approach 2

#!/usr/bin/env python3
from urllib.request import Request, urlopen
from urllib.error import URLError

req = Request("http://www.python.org/")
try:
    response = urlopen(req)
except URLError as e:
    # Check "code" first: in Python 3 an HTTPError also has a "reason"
    if hasattr(e, "code"):
        print("The server couldn't fulfill the request.")
        print("Error code: ", e.code)
    elif hasattr(e, "reason"):
        print("We failed to reach a server.")
        print("Reason: ", e.reason)
else:
    print("good!")
    print(response.read().decode("utf8"))
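
Both approaches rely on HTTPError being a subclass of URLError, which is why the more specific exception has to be handled first. The sketch below wraps that ordering in a helper that returns a labelled result instead of printing. classify and unused_local_url are hypothetical names, and the "unreachable" case is provoked by pointing at a local port that is assumed to have nothing listening on it.

```python
import socket
import urllib.error
import urllib.request


def classify(url, timeout=5):
    """Return ("ok", status), ("http_error", code) or ("unreachable", reason)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return ("ok", response.status)
    except urllib.error.HTTPError as e:
        # Must come first: HTTPError is a subclass of URLError
        return ("http_error", e.code)
    except urllib.error.URLError as e:
        # No HTTP response at all: DNS failure, refused connection, ...
        return ("unreachable", str(e.reason))


def unused_local_url():
    """Return a URL on a local port that nothing should be listening on."""
    with socket.socket() as s:
        s.bind(("127.0.0.1", 0))
        port = s.getsockname()[1]
    # The socket is closed here, so connecting to the port is refused
    return "http://127.0.0.1:%d/" % port
```

Calling classify(unused_local_url()) takes the URLError branch, because the connection is refused before any HTTP exchange happens.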

5. Redirects

import urllib.request

url = "https://www.baidu.com"
response = urllib.request.urlopen(url)
# If the final URL differs from the requested one, a redirect was followed
is_redirected = response.geturl() != url
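
urlopen follows redirects automatically, so comparing geturl() with the requested URL is how a redirect shows up. To see the first response instead, one possible approach is an opener whose redirect handler declines to follow, so that the 3xx status surfaces as an HTTPError. This is a sketch under assumptions: RedirectDemoHandler, start_redirect_server, final_url, NoRedirect and first_status are hypothetical names, and the local server stands in for a real site.

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class RedirectDemoHandler(BaseHTTPRequestHandler):
    """Hypothetical local handler: "/old" redirects to "/new"."""

    def do_GET(self):
        if self.path == "/old":
            self.send_response(302)
            self.send_header("Location", "/new")
            self.send_header("Content-Length", "0")
            self.end_headers()
        else:
            body = b"final page"
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def start_redirect_server():
    """Start the demo server on a free local port and return its base URL."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), RedirectDemoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return "http://127.0.0.1:%d" % server.server_port


def final_url(url):
    """Follow redirects; return (final URL, whether it differs from url)."""
    with urllib.request.urlopen(url, timeout=5) as response:
        landed = response.geturl()
    return landed, landed != url


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Decline every redirect so the 3xx response is raised as HTTPError."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None


def first_status(url):
    """Return the status of the first response without following redirects."""
    opener = urllib.request.build_opener(NoRedirect)
    try:
        with opener.open(url, timeout=5) as response:
            return response.status
    except urllib.error.HTTPError as e:
        return e.code
```

With base = start_redirect_server(), final_url(base + "/old") reports the "/new" address and a redirect, while first_status(base + "/old") returns the redirect status itself.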


6. Proxy settings

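import urllib.request

# Keys are URL schemes ("http", "https"). urllib does not support SOCKS
# proxies itself, so localhost:1080 is assumed here to be an HTTP proxy.
proxy_support = urllib.request.ProxyHandler({"http": "http://localhost:1080"})
opener = urllib.request.build_opener(proxy_support)
# install_opener makes this opener the default for every urlopen call
urllib.request.install_opener(opener)
a = urllib.request.urlopen("http://www.python.org/").read().decode("utf8")
print(a)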
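
install_opener changes the behaviour of every later urlopen call in the process. When only some requests should go through the proxy, the opener can be used directly through opener.open instead. The sketch below illustrates this with a stand-in proxy: EchoProxyHandler, start_fake_proxy and fetch_via_proxy are hypothetical names, and the fake proxy does not forward anything, it only echoes the URL it was asked for so the effect is visible offline.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class EchoProxyHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for an HTTP proxy: echoes the requested URL."""

    def do_GET(self):
        # A client talking to an HTTP proxy sends the absolute URL in the
        # request line, so self.path holds the full target URL.
        body = self.path.encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def start_fake_proxy():
    """Start the fake proxy on a free local port and return its URL."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), EchoProxyHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return "http://127.0.0.1:%d" % server.server_port


def fetch_via_proxy(url, proxy_url):
    """Fetch url through proxy_url with an opener local to this call."""
    handler = urllib.request.ProxyHandler({"http": proxy_url})
    opener = urllib.request.build_opener(handler)
    # opener.open is used directly, so the global urlopen is unaffected
    with opener.open(url, timeout=5) as response:
        return response.read().decode("utf-8")
```

Calling fetch_via_proxy("http://example.invalid/page", start_fake_proxy()) returns the target URL as seen by the fake proxy, which shows that the request was sent to the proxy and not to the target host.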







