A crawler hitting HTTP Error 403
Posted by rener0424
This post looks at a crawler that runs into HTTP Error 403. A 403 response means the server understood the request but refused to serve it; for scrapers the usual trigger is Python's default User-Agent, which many sites block outright, so the fix is to send browser-like headers. The original script, with its bugs corrected, follows.
# coding=utf-8
from bs4 import BeautifulSoup
import requests
import urllib.request  # the script uses urllib.request, which the bare "import urllib" did not provide

x = 1  # counter for saved images
y = 1  # counter for saved page dumps

# Many servers answer 403 Forbidden to Python's default User-Agent,
# so both requests and urlretrieve need a browser-like header.
HEADERS = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64)'}
opener = urllib.request.build_opener()
opener.addheaders = list(HEADERS.items())
urllib.request.install_opener(opener)  # urlretrieve now sends the header too

def crawl(url):
    global x, y
    res = requests.get(url, headers=HEADERS)
    soup = BeautifulSoup(res.text, 'html.parser')
    # f-string placeholders need braces; the original saved to the literal name 'y.txt' every time
    with open(f'C:/Users/Administrator/Desktop/alien/pachong/xnt/{y}.txt', 'w', encoding="utf-8") as f:
        f.write(str(soup))
    y += 1
    for yh in soup.select('img'):
        link = yh.get('src')
        if not link:
            continue
        print(link)
        urllib.request.urlretrieve(link, f'C:/Users/Administrator/Desktop/alien/pachong/xnt/{x}.jpg')
        print(f'Downloading image {x}')
        x += 1

for i in range(1, 5):
    url = "https://acg.fi/hentai/23643.htm/" + str(i)
    try:
        crawl(url)
    except ValueError:
        continue
    except Exception as e:
        print(e)
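If the 403 persists even with a User-Agent, the image host may also be checking the Referer header (hotlink protection). Below is a minimal sketch of downloading a single image with requests instead of urlretrieve; the fetch_image helper and the header values are illustrative, not part of the original script.

import requests

def fetch_image(link, path, referer):
    # A browser-like User-Agent plus the page URL as Referer defeats most naive hotlink checks.
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64)',
        'Referer': referer,  # illustrative: pass the page the image was found on
    }
    res = requests.get(link, headers=headers, timeout=10)
    res.raise_for_status()  # raise on 403/404 instead of silently saving an error page
    with open(path, 'wb') as f:
        f.write(res.content)

Inside the crawl loop this would be called as fetch_image(link, f'{x}.jpg', url), so each image request carries the page it came from as its referer.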
The above is the main content on a crawler hitting HTTP Error 403. If it did not solve your problem, the following posts may help:
Python crawler error: "HTTP Error 403: Forbidden"
urllib2.HTTPError: HTTP Error 403: Forbidden (a Python beginner asking for pointers)