Retrieving image links with BeautifulSoup
I have the following HTML, and I want to extract the data-img link.
<a rel="popover" data-img="https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1001.jpg" href="lot_details.asp?l=1&lotid=3357969&pageno=1" class="blacklink2" data-original-title="">
<img border="0" src="https://www.johnpyelots.co.uk/Sales/Sale - 4709/Thumbnails/thumb_1001.jpg" width="70" <="" a=""></a>
I am using the following Python code, but I can't seem to extract the link:
import requests
from bs4 import BeautifulSoup

urldes = "https://www.johnpyeauctions.co.uk/lot_list.asp?saleid=4709&siteid=1"
# add header
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.71 Safari/537.36'}
r = requests.get(urldes, headers=headers)
soup = BeautifulSoup(r.content, "lxml")
mylinks = []
for link in soup.find_all('a'):
    mylinks.append(link['data-image'])
for i in range(len(mylinks)):
    mylinks[i]
mylinks_0 = mylinks[0]
Any ideas?
Answer
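The page contains many links, all of them inside <a> tags. So if you only want the <a> tags that carry an image link, you need to pass more arguments to find_all(). Note also that the attribute in your HTML is data-img, while your code looks up data-image; indexing a tag with an attribute it does not have raises a KeyError.
Also, take a look at list comprehensions in Python.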
import requests
from bs4 import BeautifulSoup

r = requests.get('https://www.johnpyeauctions.co.uk/lot_list.asp?saleid=4709&siteid=1')
soup = BeautifulSoup(r.text, 'lxml')
# Keep only the <a rel="popover"> tags and read their data-img attribute.
image_links = [x['data-img'] for x in soup.find_all('a', rel='popover')]
for link in image_links:
    print(link)
Output:
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1001.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1002.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1003.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1004.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1005.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1006.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1007.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1008.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1009.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1010.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1011.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1012.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1013.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1014.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1015.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1016.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1017.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1018.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1019.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1020.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1021.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1022.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1023.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1024.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1025.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1026.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1027.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1028.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1029.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1030.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1031.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1032.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1033.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1034.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1035.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1036.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1037.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1038.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1039.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1040.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1041.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1042.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1043.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1044.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1045.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1046.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1047.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1048.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1049.jpg
https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1050.jpg
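As a possible variation, you could filter on the presence of the data-img attribute itself rather than on rel='popover', so the code keeps working if the rel value ever changes. The sketch below is only an illustration: SAMPLE_HTML is a made-up, simplified stand-in for the real page (so it runs without a network request), the function names are hypothetical, and it uses the built-in html.parser instead of lxml.

```python
from bs4 import BeautifulSoup

# Hypothetical, simplified markup modeled on the snippet in the question;
# the real page contains many more tags and attributes.
SAMPLE_HTML = """
<a href="index.asp" class="blacklink2">Home</a>
<a rel="popover" data-img="https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1001.jpg" href="lot_details.asp?l=1">lot 1</a>
<a rel="popover" data-img="https://www.johnpyelots.co.uk/Sales/Sale - 4709/Pictures/1002.jpg" href="lot_details.asp?l=2">lot 2</a>
"""


def extract_data_img(html):
    """Return the data-img value of every <a> tag that has that attribute."""
    soup = BeautifulSoup(html, "html.parser")
    # attrs={"data-img": True} matches only tags where the attribute exists,
    # so tag["data-img"] cannot raise KeyError on plain navigation links.
    return [tag["data-img"] for tag in soup.find_all("a", attrs={"data-img": True})]


def extract_data_img_css(html):
    """Same result, expressed as a CSS attribute selector."""
    soup = BeautifulSoup(html, "html.parser")
    return [tag["data-img"] for tag in soup.select("a[data-img]")]


image_links = extract_data_img(SAMPLE_HTML)
for link in image_links:
    print(link)
```

The first <a> tag in the sample has no data-img attribute and is simply skipped, which is the case that makes link['data-img'] fail when you loop over every <a> on the page.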