BeautifulSoup children of div

Posted 2023-02-23


【Title】: BeautifulSoup children of div 【Posted】: 2020-09-20 21:29:14 【Question】:

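I am trying to use BeautifulSoup to find the child divs of a specific div on a website.

I took inspiration from this answer: Beautiful Soup find children for particular div

However, when I try to retrieve the content of every div with class='row' whose parent div has class="container search-results-wrapper endless_page_template", only the content of the first div with class='row' is returned.

I am using the following code: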

    # Find the results wrapper, then print the text of every row inside it
    boatContainer = page_soup.find_all('div', class_='container search-results-wrapper endless_page_template')
    for row in boatContainer:
        all_boats = row.find_all('div', class_='row')
        for boat in all_boats:
            print(boat.text)

I am applying this to this website. What should I change so that my solution retrieves the data from all the divs with class='row' that belong to the div with class='container search-results-wrapper endless_page_template'?

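【Comments】:

Does this answer your question? How to find children of nodes using BeautifulSoup

Unfortunately, it does the same thing on the website I want to use. It only displays the text content of the first div row.

It works exactly as expected on my machine. Are you sure your initial soup is correct?

I edited my question so you can take a look.

@colla Can you tell me which data you are specifically interested in scraping from this website? Needless to say, I have read your question several times and still cannot make heads or tails of it.

【Answer 1】: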

Use response.content instead of response.text.
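As a minimal sketch of the parsing step, the snippet below passes raw bytes to BeautifulSoup (the kind of value response.content holds) and collects the text of every row inside the results wrapper. The inline html_bytes markup and the extract_row_texts helper are hypothetical stand-ins for illustration, not the real page or the asker's code, and matching the wrapper on the single class search-results-wrapper is an assumption made to keep the lookup independent of class order.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for response.content (raw bytes).
html_bytes = b"""
<div class="container search-results-wrapper endless_page_template">
  <div class="row">Boat A</div>
  <div class="row">Boat B</div>
  <div class="row">Boat C</div>
</div>
"""


def extract_row_texts(markup):
    """Return the text of each div.row inside the results wrapper."""
    # BeautifulSoup accepts bytes and works out the encoding itself.
    soup = BeautifulSoup(markup, "html.parser")
    # Matching one class finds the div whatever its other classes are.
    container = soup.find("div", class_="search-results-wrapper")
    if container is None:
        return []
    rows = container.find_all("div", class_="row")
    return [row.get_text(strip=True) for row in rows]


row_texts = extract_row_texts(html_bytes)
print(row_texts)
```

If a helper like this returns a single entry for the live page, the fetched HTML itself probably contains only one row, which points at the request rather than the parsing.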

You are also not requesting the correct URL in your code. https://www.sailogy.com/en/search/?search_where=ibiza&trip_date=2020-06-06&weeks_count=1&skipper=False&search_src=home shows only one boat, so your code returns only one row.

In this case, use https://www.sailogy.com/en/search/?search_where=ibiza&trip_date=2020-06-06&weeks_count=1&guests_count=&order_by=-rank&is_roundtrip=&coupon_code=&skipper=None instead.

At some point you may also find it useful to adjust the URL parameters to filter the boats!
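One way to adjust those parameters without editing the URL by hand is to rebuild the query string with the standard library. This is only a sketch: the with_params helper is hypothetical, and whether the site accepts a given value (such as weeks_count=2 here) is an assumption that has not been checked against the site.

```python
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit

SEARCH_URL = (
    "https://www.sailogy.com/en/search/?search_where=ibiza"
    "&trip_date=2020-06-06&weeks_count=1&guests_count=&order_by=-rank"
    "&is_roundtrip=&coupon_code=&skipper=None"
)


def with_params(url, **overrides):
    """Return url with the given query parameters replaced or added."""
    parts = urlsplit(url)
    # keep_blank_values preserves empty parameters such as guests_count=
    params = parse_qs(parts.query, keep_blank_values=True)
    for name, value in overrides.items():
        params[name] = [str(value)]
    new_query = urlencode(params, doseq=True)
    return urlunsplit(parts._replace(query=new_query))


# Assumed example value: search for a two-week trip instead of one week.
two_weeks_url = with_params(SEARCH_URL, weeks_count=2)
print(two_weeks_url)
```

The resulting URL can then be requested and parsed in the same way as the original one.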

【Comments】:

I have adjusted the answer based on the suspected reason your code was not behaving as expected.
