Loop through line of code and change integer for getElementsByClassName


【Posted】2021-01-13 03:28:00 【Question】:

Previously posted on the MrExcel forum:

www.mrexcel.com/board/threads/change-integer-in-code-line-for-htmldoc-getelementsbyclassname.1146814/

My original line of code is:

Set DogRows1 = HTMLDoc.getElementsByClassName("rpb-greyhound rpb-greyhound-1 hover-opacity")

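It works fine for the integer 1. However, I need to increment it by 1 each time, changing it to 2, 3, 4, 5 and 6 for the other web pages, as shown below.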

Set DogRows1 = HTMLDoc.getElementsByClassName("rpb-greyhound rpb-greyhound-6 hover-opacity")

I tried declaring some variables and adding a For Next loop, but it will not loop. What am I doing wrong? Have I put the For Next loop in the wrong place?

Dim StartRaceNumber As Integer
Dim LastRaceNumber As Integer

XMLReq.Open "GET", DogPageURL, False
XMLReq.send

If XMLReq.Status <> 200 Then
    MsgBox "Problem" & vbNewLine & XMLReq.Status & " - " & XMLReq.statusText
    Exit Sub
End If

HTMLDoc.body.innerhtml = XMLReq.responseText
Set XMLReq = Nothing

LastRaceNumber = 6

For StartRaceNumber = 1 To LastRaceNumber
    Set DogRows1 = HTMLDoc.getElementsByClassName("rpb-greyhound rpb-greyhound-" & StartRaceNumber & " hover-opacity")
    For Each DogRow1 In DogRows1
        Set DogNameLink1 = DogRow1.getElementsByTagName("a")(0)
        NextHref = DogRow1.getAttribute("href")
        NextURL = DogURL & Mid(NextHref, InStr(NextHref, ":") + 28)
        Debug.Print DogRow1.innerText, NextURL
    Next DogRow1
Next StartRaceNumber
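
One way to narrow down why the loop appears not to loop is to print how many elements each class string matches. The following is only a minimal sketch, assuming the page has already been loaded into an MSHTML.HTMLDocument as in the code above. DogRowClass and CountDogRows are hypothetical names that do not appear in the original code, and the upper bound of 6 is taken from the question.

```vba
Option Explicit

' Hypothetical helper: builds the class string for one greyhound number.
Private Function DogRowClass(ByVal DogNumber As Long) As String
    DogRowClass = "rpb-greyhound rpb-greyhound-" & DogNumber & " hover-opacity"
End Function

' Hypothetical diagnostic: prints the number of matches for each class string.
' Assumes HTMLDoc already holds the parsed race page.
Public Sub CountDogRows(ByVal HTMLDoc As MSHTML.HTMLDocument)
    Dim DogNumber As Long
    Dim DogRows As MSHTML.IHTMLElementCollection

    For DogNumber = 1 To 6
        Set DogRows = HTMLDoc.getElementsByClassName(DogRowClass(DogNumber))
        Debug.Print DogRowClass(DogNumber), DogRows.Length
    Next DogNumber
End Sub
```

If only the first line in the Immediate window shows a non-zero count, the loop itself is running and the other class strings simply match nothing on that page.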

【Comments】:

Could you share the URL?

【Answer 1】:

To confirm, all I need is the URL for each greyhound on each race page, so that I can scrape each greyhound's form table.

For example:

Nottingham 11.06

#1 BALLYBOUGH GARY

https://www.timeform.com/greyhound-racing/greyhound-form/ballybough-gary/59297

#2 SALACRES BRUISER

https://www.timeform.com/greyhound-racing/greyhound-form/salacres-bruiser/59746

#3 FOLLOW MY LEAD

https://www.timeform.com/greyhound-racing/greyhound-form/follow-my-lead/54898

#4 HONOUR SAMURAI

https://www.timeform.com/greyhound-racing/greyhound-form/honour-samurai/53100

#5 NIDDERDALEFLURRY

https://www.timeform.com/greyhound-racing/greyhound-form/nidderdaleflurry/56446

#6 SPORTY MELODY

https://www.timeform.com/greyhound-racing/greyhound-form/sporty-melody/58746

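I have already developed a Power Query function that scrapes the form data from each of those URL pages. I am just struggling to get the complete list of the six greyhound form URLs (as above) for each race.

Does that make sense?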

【Discussion】:

【Answer 2】:

OK, Sim.

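The scraping order is as follows:

Get the greyhound racecard URLs: Greyhound Races

Get the greyhound URLs for the dogs in each race: List of Greyhounds in the race

Get the greyhound form details; this is an example for greyhound #1: Form of Each Greyhound #1

Then loop to the next race and repeat.

As I said, my code can only scrape the form details for greyhound #1 in each race. I need the other dogs as well, if you can help.

These are my modules; hopefully they have imported correctly: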

Option Explicit

Const DogURL As String = "https://www.timeform.com/greyhound-racing/racecards"

Sub ListDogRace()

Dim XMLReq As New MSXML2.XMLHTTP60
Dim HTMLDoc As New MSHTML.HTMLDocument

Dim TFRaceList As MSHTML.IHTMLElement
Dim TFRaces As MSHTML.IHTMLElementCollection
Dim TFRace As MSHTML.IHTMLElement

Dim NextHref As String
Dim NextURL As String

XMLReq.Open "GET", DogURL, False
XMLReq.send

If XMLReq.Status <> 200 Then
    MsgBox "Problem" & vbNewLine & XMLReq.Status & " - " & XMLReq.statusText
    Exit Sub
End If

HTMLDoc.body.innerhtml = XMLReq.responseText
Set XMLReq = Nothing

Set TFRaces = HTMLDoc.getElementsByClassName("wfr-race bg-light-gray hover-opacity")

For Each TFRace In TFRaces

    NextHref = TFRace.getAttribute("href")
    NextURL = DogURL & Mid(NextHref, InStr(NextHref, ":") + 28)    
    ListDogsOnPage TFRace.innerText, NextURL
Next TFRace

End Sub

Sub ListDogsOnPage(DogName As String, DogPageURL As String)

Dim XMLReq As New MSXML2.XMLHTTP60
Dim HTMLDoc As New MSHTML.HTMLDocument

Dim DogRow1 As MSHTML.IHTMLElement
Dim DogRows1 As MSHTML.IHTMLElementCollection

Dim DogNameLink1 As MSHTML.IHTMLElement

Dim NextHref As String
Dim NextURL As String

Dim StartRaceNumber As Integer
Dim LastRaceNumber As Integer

XMLReq.Open "GET", DogPageURL, False
XMLReq.send

If XMLReq.Status <> 200 Then
    MsgBox "Problem" & vbNewLine & XMLReq.Status & " - " & XMLReq.statusText
    Exit Sub
End If

HTMLDoc.body.innerhtml = XMLReq.responseText
Set XMLReq = Nothing

LastRaceNumber = 6

For StartRaceNumber = 1 To LastRaceNumber
    Set DogRows1 = HTMLDoc.getElementsByClassName("rpb-greyhound rpb-greyhound-" & StartRaceNumber & " hover-opacity")
    For Each DogRow1 In DogRows1
        Set DogNameLink1 = DogRow1.getElementsByTagName("a")(0)
        NextHref = DogRow1.getAttribute("href")
        NextURL = DogURL & Mid(NextHref, InStr(NextHref, ":") + 28)
        Debug.Print DogRow1.innerText, NextURL
    Next DogRow1
Next StartRaceNumber

End Sub
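
Both procedures above repeat the same request-and-parse steps, so one possible tidy-up is to move them into a single function. This is only a sketch, assuming the same references to Microsoft XML v6.0 and the Microsoft HTML Object Library that the modules already rely on. LoadPage is a hypothetical name and is not part of the original modules.

```vba
' Hypothetical helper: fetches a page and returns it as a parsed document.
' Returns Nothing when the request does not come back with status 200.
Private Function LoadPage(ByVal PageURL As String) As MSHTML.HTMLDocument
    Dim XMLReq As New MSXML2.XMLHTTP60
    Dim HTMLDoc As New MSHTML.HTMLDocument

    ' Synchronous GET, as in the original procedures.
    XMLReq.Open "GET", PageURL, False
    XMLReq.send

    If XMLReq.Status = 200 Then
        HTMLDoc.body.innerHTML = XMLReq.responseText
        Set LoadPage = HTMLDoc
    End If
End Function
```

A caller could then write Set HTMLDoc = LoadPage(DogPageURL) and exit early when the result Is Nothing, which keeps the looping logic in each procedure easier to read.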

【Discussion】:
