Not clicking all tabs and not looping once issues

Posted 2023-02-23

【Title】: Not clicking all tabs and not looping once issues 【Posted】: 2018-04-10 08:13:16 【Question】:

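I am trying to click the tabs on a web page, as shown below. Unfortunately, it only seems to click some of the tabs, even though the XPath checks out as correct in Chrome's inspector. I can only assume it is not clicking all of the tabs because the full XPath is not being used.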

However, I have tried changing the XPath from:

//div[@class="KambiBC-collapsible-container KambiBC-mod-event-group-container"]

to:

//div[@class='KambiBC-event-groups-list']//div[@class="KambiBC-collapsible-container KambiBC-mod-event-group-container"]

in:

clickMe = wait(driver, 10).until(EC.element_to_be_clickable((By.XPATH,'(//div[@class="KambiBC-collapsible-container KambiBC-mod-event-group-container"])[%s]' % str(index + 1))))

But the problem persists. I have also tried using CSS:

#KambiBC-contentWrapper__bottom > div > div > div > div > div.KambiBC-quick-browse-container.KambiBC-quick-browse-container--list-only-mode > div.KambiBC-quick-browse__list.KambiBC-delay-scroll--disabled > div > div.KambiBC-time-ordered-list-container > div.KambiBC-time-ordered-list-content > div > div > div.KambiBC-collapsible-container.KambiBC-mod-event-group-container > header

However, this keeps giving me an error when used in:

clickMe = wait(driver, 10).until(EC.element_to_be_clickable((By.CSS_SELECTOR,'("#KambiBC-contentWrapper__bottom > div > div > div > div > div.KambiBC-quick-browse-container.KambiBC-quick-browse-container--list-only-mode > div.KambiBC-quick-browse__list.KambiBC-delay-scroll > div > div.KambiBC-time-ordered-list-container > div.KambiBC-time-ordered-list-content > div > div > div > header")[%s]' % str(index + 1))))

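Note that I want to click every tab that is not already open. I cannot seem to find the elements specifically enough with a CSS selector, because I do not think CSS lets you narrow down the class elements the way XPath does in this case.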
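The `("...")[n]` grouping used in the CSS call above is XPath syntax. CSS selectors have no way to pick the n-th element of the whole match set (`:nth-child` and `:nth-of-type` count among siblings only), which is one likely source of the error. A minimal sketch of indexing on the Python side instead, assuming the `find_elements(by, value)` method of a Selenium driver; the helper name `nth_match` and the shortened selector `HEADER_CSS` are illustrative, not part of the original code:

```python
# "css selector" is the string value of selenium's By.CSS_SELECTOR,
# so the sketch needs no import; By.CSS_SELECTOR works the same way.
CSS_SELECTOR = "css selector"

# Tail of the long selector from the question (assumed to be specific enough).
HEADER_CSS = ("div.KambiBC-collapsible-container"
              ".KambiBC-mod-event-group-container > header")


def nth_match(driver, css, index):
    """Return the index-th (0-based) element matching css, or None."""
    matches = driver.find_elements(CSS_SELECTOR, css)
    return matches[index] if 0 <= index < len(matches) else None
```

With a live driver this would be called as `nth_match(driver, HEADER_CSS, index)`; it does not wait for the element, so an explicit wait may still be needed before the call.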

Is there a way around the problem of not every tab being clicked?

Note also that I am using a for index in indexes: loop:

indexes = [index for index in range(len(options))]
shuffle(indexes)
for index in indexes:

Is there a more elegant way of doing this with a single for loop?
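One possible answer to the loop question, as a sketch: `random.sample` over the whole range returns a shuffled permutation directly, so the list comprehension and the separate `shuffle` call can be folded into the loop header. The helper name `shuffled_indexes` and the stand-in `options` list are assumptions made for illustration.

```python
import random


def shuffled_indexes(n, rng=random):
    """Return the integers 0..n-1 in random order, each exactly once."""
    # Sampling all n items without replacement yields a permutation,
    # so no separate shuffle() call is needed.
    return rng.sample(range(n), n)


options = ["tab-a", "tab-b", "tab-c", "tab-d"]  # stand-in for the located tabs
for index in shuffled_indexes(len(options)):
    print(index, options[index])
```

If the random order is not actually required, `for index in range(len(options)):` is simpler still and makes failures easier to reproduce.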

Full code

【Discussion】:

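Have you tried using the whole CSS selector to see whether it clicks?

It is worth noting that the problem arises because the values change every time an unopened tab is clicked, so the code ends up looking for elements that no longer exist. See: ***.com/questions/48007152/…. I do not know how to fix it, but at least the root cause has been identified.

【Answer 1】: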

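This loops through every match of every league, one league at a time, and collects all the relevant data as needed. You can collect more data from within each match by prefixing each query with . and running it against the match element, using an XPath of the form './/your-query-here'. Let me know whether this works!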

import os, csv, time
from selenium.webdriver.support.ui import WebDriverWait as wait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.common.exceptions import WebDriverException
from selenium import webdriver

driver = webdriver.Chrome()
driver.set_window_size(1024, 600)
driver.maximize_window()

try:
    os.remove('vtg121.csv')
except OSError:
    pass

driver.get('https://www.unibet.com.au/betting#filter/football')
time.sleep(1)

# Wait until at least one league container is clickable before querying.
wait(driver, 10).until(EC.element_to_be_clickable((By.XPATH,
    ('//div[@class="KambiBC-collapsible-container '
    'KambiBC-mod-event-group-container"]'))))

xp_opened = '//div[contains(@class, "KambiBC-expanded")]'
xp_unopened = '//div[@class="KambiBC-collapsible-container ' \
    'KambiBC-mod-event-group-container" ' \
    'and not(contains(@class, "KambiBC-expanded"))]'

# Relative XPaths (leading "."), defined once so that both loops can use
# them even when no league is open on page load.
xp_matches = './/li[contains(@class,"KambiBC-event-item")]'
xp_ln = './/span[@class="KambiBC-mod-event-group-header__main-title"]'
xp_team1_name = './/button[@class="KambiBC-mod-outcome"][1]//' \
    'span[@class="KambiBC-mod-outcome__label"]'
xp_team1_odds = './/button[@class="KambiBC-mod-outcome"][1]//' \
    'span[@class="KambiBC-mod-outcome__odds"]'
xp_team2_name = './/button[@class="KambiBC-mod-outcome"][3]//' \
    'span[@class="KambiBC-mod-outcome__label"]'
xp_team2_odds = './/button[@class="KambiBC-mod-outcome"][3]//' \
    'span[@class="KambiBC-mod-outcome__odds"]'

# find_elements(By.XPATH, ...) replaces find_elements_by_xpath, which was
# removed in Selenium 4.3.
opened = driver.find_elements(By.XPATH, xp_opened)
unopened = driver.find_elements(By.XPATH, xp_unopened)

def text_or_none(element, xpath):
    # Text of the first node matching xpath under element, or None if the
    # lookup fails (missing or stale element).
    try:
        return element.find_element(By.XPATH, xpath).text
    except WebDriverException:
        return None

def collect(league):
    # Append one row per match in this league to data.
    ln = text_or_none(league, xp_ln)
    if ln is not None:
        ln = ln.strip()
    print(ln)

    for match in league.find_elements(By.XPATH, xp_matches):
        # get all the data per 'match group'
        data.append([ln,
                     text_or_none(match, xp_team1_name),
                     text_or_none(match, xp_team1_odds),
                     text_or_none(match, xp_team2_name),
                     text_or_none(match, xp_team2_odds)])

data = []
for league in opened:
    collect(league)

for league in unopened:
    # Expand the league first so that its matches are rendered.
    league.click()
    time.sleep(0.5)
    collect(league)

with open('vtg121.csv', 'a', newline='', encoding="utf-8") as outfile:
    writer = csv.writer(outfile)
    for row in data:
        writer.writerow(row)
        print(row)
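The leading `.` in the per-match queries is what keeps each lookup inside one match: in Selenium, an XPath that starts with `//` searches the whole document even when it is called on an element. The scoping idea can be tried without a browser. The sketch below uses the standard library's `xml.etree.ElementTree` in place of Selenium, on made-up markup, so the class names and the helper `rows` are illustrative only.

```python
import xml.etree.ElementTree as ET

# Made-up markup standing in for one league that holds two matches.
LEAGUE = ET.fromstring("""
<div class="league">
  <li class="event"><span class="label">Team A</span><span class="odds">1.50</span></li>
  <li class="event"><span class="label">Team B</span><span class="odds">2.75</span></li>
</div>
""")


def rows(league):
    """Collect (label, odds) per match, querying relative to each match."""
    out = []
    for match in league.findall('.//li[@class="event"]'):
        # ".//" starts at `match`, so only this match's spans are considered.
        label = match.find('.//span[@class="label"]')
        odds = match.find('.//span[@class="odds"]')
        out.append((label.text if label is not None else None,
                    odds.text if odds is not None else None))
    return out


print(rows(LEAGUE))  # [('Team A', '1.50'), ('Team B', '2.75')]
```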

【Discussion】:

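Did you test it? It clicks a few elements and then gives raise TimeoutException(message, screen, stacktrace). Take a look: pastebin.com/XznK0gKP

See the update; it is now clearer why this happens. I do not know how to fix it, but the exact reason the tabs are not clicked has been identified.

@user9145009 Take a look at my edit. I adjusted the solution so that you get all the data from each group.

【Answer 2】: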

OP's code without extra imports

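The error occurs because the site's XPaths to the tabs the OP wants are not consecutive; there is a gap. For example, right now I cannot find

//*[@id="KambiBC-contentWrapper__bottom"]/div/div/div/div/div[3]/div[1]/div/div[3]/div[2]/div/div/div[2]/header

A while ago, before the game went live, I could not find

//*[@id="KambiBC-contentWrapper__bottom"]/div/div/div/div/div[3]/div[1]/div/div[3]/div[2]/div/div/div[1]/header

By index I mean the number in the final div[...] step before /header (shown in bold in the original post).

When a game goes live, the tabs suddenly change index from 2 to 1 (the bold part changes). In both cases there is a gap: either 1 cannot be found or 2 cannot be found.

My guess is that the gap exists because another, non-clickable element sits in between. See the image below.

League is what causes the gap. So whenever the code reaches the index occupied by League, it times out. Because the League button and the other tabs swap positions between League and live matches, the indexes swap when the positions change. (I think that is why I first could not find the XPath with 1 in the bold part, and later could not find the one with 2.)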
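The gap can be reproduced without a browser. The sketch below uses the standard library's `xml.etree.ElementTree` on made-up markup in which the middle child has no `<header>`, mirroring the role the answer attributes to the League button; the element names, class names, and the helper `header_at` are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Made-up container: children 1 and 3 are tabs with a <header>,
# child 2 (the "League" slot) has none.
CONTAINER = ET.fromstring("""
<div id="list">
  <div class="tab"><header>Live</header></div>
  <div class="league-button">League</div>
  <div class="tab"><header>Upcoming</header></div>
</div>
""")


def header_at(container, position):
    """Return the header text under the position-th child div (1-based), or None."""
    header = container.find("div[%d]/header" % position)
    return header.text if header is not None else None


for position in (1, 2, 3):
    print(position, header_at(CONTAINER, position))
# 1 Live
# 2 None
# 3 Upcoming
```

A positional query for the middle slot finds nothing, which in Selenium shows up as the wait timing out at that index.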

Below is part of the OP's code. Note the str(index + 1) at the end.

indexes = [index for index in range(len(options))]
shuffle(indexes)  # the OP uses shuffle from random; every index, including 0 and 1, is still present
path = '(//div[@class="KambiBC-collapsible-container KambiBC-mod-event-group-container"])'
for index in indexes:
    # Some indexes are missing because of the League button, so nothing
    # is found at that index and the wait times out.
    clickMe = wait(driver, 10).until(
        EC.element_to_be_clickable((By.XPATH, path + '[%s]' % str(index + 1))))

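Solution

Try catching the timeout exception to skip the index occupied by League. You could also keep a counter so that only one timeout exception is tolerated per page. If a second timeout occurs, you know something other than the League button is wrong and should stop.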

from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
import time

driver = webdriver.Firefox()
driver.set_window_size(1024, 600)
driver.maximize_window()
wait = WebDriverWait

driver.get('https://www.unibet.com.au/betting#filter/football')
time.sleep(5)

# Every child <div> of the list container, including the non-clickable League one.
tabs_xpath = '//*[@id="KambiBC-contentWrapper__bottom"]/div/div/div/div/div[3]/div[1]/div/div[3]/div[2]/div/div/div'
options = driver.find_elements(By.XPATH, tabs_xpath)

print("Total tabs that we want to open is {}".format(len(options)))

for index in range(len(options)):
    print(index)
    try:
        # XPath positions are 1-based, hence index + 1.
        clickMe = wait(driver, 5).until(EC.presence_of_element_located(
            (By.XPATH, '{}[{}]/header'.format(tabs_xpath, index + 1))))
        clickMe.click()
    except TimeoutException:
        # No <header> at this position (the League slot): skip it.
        print("catch you! {}".format(index))
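The code above skips every timeout it meets. The counter idea from the solution text (tolerate one timeout per page, stop on the second) could be sketched as follows. The function `click_tabs` and its `locate` parameter are hypothetical names: `locate` is assumed to be a caller-supplied callable that returns the element at a 1-based index or raises `TimeoutException`. The import falls back to the built-in `TimeoutError` only so that the sketch can run without Selenium installed.

```python
try:
    from selenium.common.exceptions import TimeoutException
except ImportError:  # fallback so the sketch runs without Selenium
    TimeoutException = TimeoutError


def click_tabs(locate, count, max_timeouts=1):
    """Click tabs 1..count, tolerating at most max_timeouts missing indexes.

    Returns the list of indexes that were skipped.
    """
    skipped = []
    for index in range(1, count + 1):
        try:
            locate(index).click()
        except TimeoutException:
            skipped.append(index)
            if len(skipped) > max_timeouts:
                # More gaps than the League button explains: re-raise
                # so the caller sees that something else is wrong.
                raise
    return skipped
```

With the names from the code above, `locate` might wrap the existing wait, for example a small function that calls `wait(driver, 5).until(...)` with the XPath formatted for the given index.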

【Discussion】:

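That is a nice find, thank you. I added the TimeoutException handling and index + 1, but it still does not click everything. The page actually randomizes which tabs are open when it loads. pastebin.com/H3x8VVYD

Maybe we could avoid randomizing until the code is bulletproof? If it is random, it will be hard to debug. What is the behaviour now? Still a timeout?

Actually, the loop's range does not need the + 1. The attached code clicks every tab.

Unfortunately not (I retested and noticed some tabs were not clicked). I believe one workaround is to use driver.execute_script("arguments[0].click();", clickMe), but for some strange reason it does not work when using the index. If I use a single CSS selector, however, it does work.

Which ones are not being clicked? It still works on my computer.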
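Two points from the discussion can be combined: positions shift whenever a tab opens, and a JavaScript click was suggested as a workaround. One way to avoid indexes altogether is to re-query the unopened tabs after every click and always click the first one. This is a sketch under assumptions, not a tested fix for this site: it assumes that opening a tab changes its class attribute (as the first answer's XPath implies), so that the exact-match XPath stops matching it. The function name `open_all_tabs` and the `max_clicks` safety limit are hypothetical.

```python
# "xpath" is the string value of selenium's By.XPATH, so no import is needed.
XPATH = "xpath"

# Exact class match: assumed to stop matching once a tab is expanded.
XP_UNOPENED = ('//div[@class="KambiBC-collapsible-container '
               'KambiBC-mod-event-group-container"]')


def open_all_tabs(driver, max_clicks=200):
    """Click unopened tabs one at a time, re-querying after every click."""
    clicks = 0
    while clicks < max_clicks:
        unopened = driver.find_elements(XPATH, XP_UNOPENED)
        if not unopened:
            break
        # JavaScript click, as suggested in the discussion; it bypasses
        # Selenium's own clickability checks.
        driver.execute_script("arguments[0].click();", unopened[0])
        clicks += 1
    return clicks
```

The `max_clicks` limit only guards against an endless loop if a click does not actually open the tab; a short wait after each click may also be needed on the real page.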
