Scrape a class within a class

Posted 2023-02-23

【Title】: Scrape a class within a class 【Posted】: 2017-10-12 23:46:18 【Question】:

I want to scrape the href inside the element with class_="_e4b". Basically, I am looking to scrape a class within a class using BeautifulSoup.

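from bs4 import BeautifulSoup
import selenium.webdriver as webdriver

url = ("https://www.google.com/search?...")

def get_related_search(url):
    # Launch Chrome through the local chromedriver binary
    driver = webdriver.Chrome("C:\\Users\\John\\bin\\chromedriver.exe")
    driver.get(url)
    # Name the parser explicitly so bs4 does not warn and guess one
    soup = BeautifulSoup(driver.page_source, "html.parser")
    driver.quit()  # close the browser once the page source is parsed
    # All <p> elements that carry the _e4b class
    relate_result = soup.find_all("p", class_="_e4b")
    return relate_result[0]

relate_url = get_related_search(url)
print(relate_url)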

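Result: markup_type=markup_type)) p class="_e4b"a href="/search?...a/p

I now want to scrape the href from this result. I am not sure what the next step would be. Thanks for the help.

Note: I stripped the angle brackets (<>) from the output above, because it was not displaying as HTML markup.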

【Answer 1】:

You can actually locate this inner a element in one go with a CSS selector:

links = soup.select("p._e4b a[href]")
for link in links:
    print(link['href'])

p._e4b a[href] matches every a element that has an href attribute and sits anywhere inside a p element with the _e4b class.
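
To try the selector without launching a browser, here is a minimal sketch that runs it against a small hand-written HTML string. The markup and the query strings in it are made up for illustration and only stand in for driver.page_source; the real Google page will look different. It also shows a two-step alternative (find the outer p elements, then search inside each one), which should give the same links.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for driver.page_source (an assumption,
# not the real page): one matching link, one <a> without href, and one
# link inside a <p> that has a different class.
html = """
<div>
  <p class="_e4b"><a href="/search?q=first">first</a></p>
  <p class="_e4b"><a name="no-href">anchor without href</a></p>
  <p class="other"><a href="/search?q=ignored">ignored</a></p>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# One step: CSS selector with a descendant combinator.
hrefs_css = [a["href"] for a in soup.select("p._e4b a[href]")]

# Two steps: find the outer <p class="_e4b"> elements, then look for
# <a> tags with an href attribute inside each of them.
hrefs_nested = [
    a["href"]
    for p in soup.find_all("p", class_="_e4b")
    for a in p.find_all("a", href=True)
]

print(hrefs_css)     # ['/search?q=first']
print(hrefs_nested)  # ['/search?q=first']
```

The selector form is shorter; the nested form may be handier when something else is needed from each outer p element along the way.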
