A crawler hitting HTTP Error 403
Posted by rener0424
This post looks at a crawler that runs into HTTP Error 403. A 403 response means the server understood the request but refused to serve it; for scrapers the usual trigger is Python's default User-Agent, which many sites block outright, so the fix is to send browser-like headers. The original script, with its bugs corrected, follows.
# coding=utf-8
from bs4 import BeautifulSoup
import requests
import urllib.request  # the script uses urllib.request, which the bare "import urllib" did not provide

x = 1  # counter for saved images
y = 1  # counter for saved page dumps

# Many servers answer 403 Forbidden to Python's default User-Agent,
# so both requests and urlretrieve need a browser-like header.
HEADERS = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64)'}
opener = urllib.request.build_opener()
opener.addheaders = list(HEADERS.items())
urllib.request.install_opener(opener)  # urlretrieve now sends the header too

def crawl(url):
    global x, y
    res = requests.get(url, headers=HEADERS)
    soup = BeautifulSoup(res.text, 'html.parser')
    # f-string placeholders need braces; the original saved to the literal name 'y.txt' every time
    with open(f'C:/Users/Administrator/Desktop/alien/pachong/xnt/{y}.txt', 'w', encoding="utf-8") as f:
        f.write(str(soup))
    y += 1
    for yh in soup.select('img'):
        link = yh.get('src')
        if not link:
            continue
        print(link)
        urllib.request.urlretrieve(link, f'C:/Users/Administrator/Desktop/alien/pachong/xnt/{x}.jpg')
        print(f'Downloading image {x}')
        x += 1

for i in range(1, 5):
    url = "https://acg.fi/hentai/23643.htm/" + str(i)
    try:
        crawl(url)
    except ValueError:
        continue
    except Exception as e:
        print(e)
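If the 403 persists even with a User-Agent, the image host may also be checking the Referer header (hotlink protection). Below is a minimal sketch of downloading a single image with requests instead of urlretrieve; the fetch_image helper and the header values are illustrative, not part of the original script.

import requests

def fetch_image(link, path, referer):
    # A browser-like User-Agent plus the page URL as Referer defeats most naive hotlink checks.
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64)',
        'Referer': referer,  # illustrative: pass the page the image was found on
    }
    res = requests.get(link, headers=headers, timeout=10)
    res.raise_for_status()  # raise on 403/404 instead of silently saving an error page
    with open(path, 'wb') as f:
        f.write(res.content)

Inside the crawl loop this would be called as fetch_image(link, f'{x}.jpg', url), so each image request carries the page it came from as its referer.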
The above is the main content on a crawler hitting HTTP Error 403. If it did not solve your problem, the following posts may help:
Python crawler error: "HTTP Error 403: Forbidden"
urllib2.HTTPError: HTTP Error 403: Forbidden (a Python beginner asking for pointers)