Retrieving image links with BeautifulSoup


I have the following HTML, and I want to extract the data-img link.

<a rel="popover" data-img="https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1001.jpg" href="lot_details.asp?l=1&amp;lotid=3357969&amp;pageno=1" class="blacklink2" data-original-title="">
<img border="0" src="https://www.johnpyelots.co.uk/Sales/Sale - 4709/Thumbnails/thumb_1001.jpg" width="70"></a>

I am using the following Python code, but I can't seem to extract the link:

urldes = "https://www.johnpyeauctions.co.uk/lot_list.asp?saleid=4709&siteid=1"

# add header
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.71 Safari/537.36'} 
r = requests.get(urldes, headers=headers) 
soup = BeautifulSoup(r.content, "lxml")

mylinks = []
for link in soup.find_all('a'):
    mylinks.append(link['data-image'])

for i in range(len(mylinks)):
    mylinks[i]

mylinks_0 = mylinks[0]

Any ideas?

Answer

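The page has many links inside <a> tags. So if you only want the <a> tags that carry image links, you need to pass more arguments to find_all(). Note also that the attribute in your HTML is data-img, whereas your code looks up 'data-image'.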

Also, take a look at list comprehensions in Python.
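As a small illustration of that pointer, here is a sketch that builds the same list twice, once with an explicit loop and once with a list comprehension. The `anchors` list is hypothetical: plain dictionaries standing in for the attribute dicts that BeautifulSoup exposes through `tag.attrs`, so the snippet runs without a network request.

```python
# Hypothetical attribute dictionaries standing in for parsed <a> tags.
anchors = [
    {"rel": ["popover"],
     "data-img": "https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1001.jpg"},
    {"href": "index.asp"},  # an ordinary link with no data-img attribute
    {"rel": ["popover"],
     "data-img": "https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1002.jpg"},
]

# Explicit loop: append each matching value one at a time.
links_loop = []
for attrs in anchors:
    if "data-img" in attrs:
        links_loop.append(attrs["data-img"])

# List comprehension: the same filter and lookup in a single expression.
links_comp = [attrs["data-img"] for attrs in anchors if "data-img" in attrs]

print(links_loop == links_comp)
```

The script that follows applies the same pattern to the live page.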

import requests
from bs4 import BeautifulSoup

# Fetch the lot list page and parse it with the lxml parser.
r = requests.get('https://www.johnpyeauctions.co.uk/lot_list.asp?saleid=4709&siteid=1')
soup = BeautifulSoup(r.text, 'lxml')
# Keep only the <a rel="popover"> tags and read their data-img attribute.
image_links = [x['data-img'] for x in soup.find_all('a', rel='popover')]
for link in image_links:
    print(link)

Output:

https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1001.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1002.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1003.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1004.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1005.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1006.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1007.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1008.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1009.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1010.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1011.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1012.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1013.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1014.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1015.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1016.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1017.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1018.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1019.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1020.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1021.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1022.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1023.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1024.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1025.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1026.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1027.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1028.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1029.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1030.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1031.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1032.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1033.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1034.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1035.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1036.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1037.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1038.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1039.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1040.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1041.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1042.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1043.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1044.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1045.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1046.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1047.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1048.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1049.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1050.jpg
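
For anyone who wants to see why the loop in the question fails without hitting the site, here is a minimal offline sketch. The inline `html` string is a hypothetical, trimmed stand-in for the real page, and it uses the built-in `html.parser` rather than `lxml` only to avoid an extra dependency. It shows that indexing a tag with a missing attribute raises `KeyError`, and two ways around it that do not rely on `rel='popover'`: filtering on the presence of `data-img`, or using `Tag.get()`.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the page: one plain link and one lot link.
html = """
<a href="index.asp">Home</a>
<a rel="popover" data-img="https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1001.jpg"
   href="lot_details.asp?l=1" class="blacklink2">lot 1</a>
"""

soup = BeautifulSoup(html, "html.parser")

# Indexing a tag with an attribute it does not have raises KeyError.
first_anchor = soup.find("a")
try:
    first_anchor["data-image"]
    missing = False
except KeyError:
    missing = True

# Passing True as the value matches any tag that has the attribute at all.
image_links = [a["data-img"] for a in soup.find_all("a", attrs={"data-img": True})]

# Tag.get() returns None instead of raising when the attribute is absent.
maybe_links = [a.get("data-img") for a in soup.find_all("a")]

print(missing)
print(image_links)
print(maybe_links)
```

Filtering on the attribute itself can be a reasonable choice when you are unsure whether every image link shares the same `rel` value.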
