Loop through line of code and change integer for getElementsByClassName
【Posted】: 2021-01-13 03:28:00
【Question】: Previously posted on the MrExcel forum:
www.mrexcel.com/board/threads/change-integer-in-code-line-for-htmldoc-getelementsbyclassname.1146814/
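My original line of code was:
Set DogRows1 = HTMLDoc.getElementsByClassName("rpb-greyhound rpb-greyhound-1 hover-opacity")
It works fine for the integer 1. However, I need to increment it by 1 so that it becomes 2, 3, 4, 5 and 6 for the other web pages, like this:
Set DogRows1 = HTMLDoc.getElementsByClassName("rpb-greyhound rpb-greyhound-6 hover-opacity")
I tried declaring some variables and adding a For Next loop, but it will not loop. What am I doing wrong? Have I put the For Next loop in the wrong place?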
Dim StartRaceNumber As Integer
Dim LastRaceNumber As Integer

XMLReq.Open "GET", DogPageURL, False
XMLReq.send

If XMLReq.Status <> 200 Then
    MsgBox "Problem" & vbNewLine & XMLReq.Status & " - " & XMLReq.statusText
    Exit Sub
End If

HTMLDoc.body.innerHTML = XMLReq.responseText
Set XMLReq = Nothing

LastRaceNumber = 6
For StartRaceNumber = 1 To LastRaceNumber
    Set DogRows1 = HTMLDoc.getElementsByClassName("rpb-greyhound rpb-greyhound-" & StartRaceNumber & " hover-opacity")
    For Each DogRow1 In DogRows1
        Set DogNameLink1 = DogRow1.getElementsByTagName("a")(0)
        NextHref = DogRow1.getAttribute("href")
        NextURL = DogURL & Mid(NextHref, InStr(NextHref, ":") + 28)
        Debug.Print DogRow1.innerText, NextURL
    Next DogRow1
Next StartRaceNumber
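One way to check whether the loop is actually running is to print the class string and the number of matching elements for each box number. The following is only a minimal sketch: it assumes the same Microsoft HTML Object Library reference as the code above, and `TrapClassName` and `ReportMatches` are hypothetical names that do not appear in the original code.

```vba
Option Explicit

' Hypothetical helper: builds the class string for box number N.
Function TrapClassName(ByVal N As Long) As String
    TrapClassName = "rpb-greyhound rpb-greyhound-" & N & " hover-opacity"
End Function

' Hypothetical diagnostic: prints how many elements match for boxes 1 to 6.
Sub ReportMatches(ByVal HTMLDoc As MSHTML.HTMLDocument)
    Dim N As Long
    Dim Matches As MSHTML.IHTMLElementCollection

    For N = 1 To 6
        Set Matches = HTMLDoc.getElementsByClassName(TrapClassName(N))
        ' A count of 0 means the inner For Each loop has nothing to iterate over.
        Debug.Print TrapClassName(N), Matches.Length
    Next N
End Sub
```

If the outer loop runs but every count after the first is 0, the class names on the page probably differ from the pattern being built, which would look the same as the loop "not looping".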
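【Comments】:
Can you share the URL?
【Answer 1】: To confirm, all I need is the URL of each greyhound on each race page, so that I can scrape that greyhound's form.
For example:
Nottingham 11.06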
#1 BALLYBOUGH GARY
https://www.timeform.com/greyhound-racing/greyhound-form/ballybough-gary/59297
#2 SALACRES BRUISER
https://www.timeform.com/greyhound-racing/greyhound-form/salacres-bruiser/59746
#3 FOLLOW MY LEAD
https://www.timeform.com/greyhound-racing/greyhound-form/follow-my-lead/54898
#4 HONOUR SAMURAI
https://www.timeform.com/greyhound-racing/greyhound-form/honour-samurai/53100
#5 NIDDERDALEFLURRY
https://www.timeform.com/greyhound-racing/greyhound-form/nidderdaleflurry/56446
#6 SPORTY MELODY
https://www.timeform.com/greyhound-racing/greyhound-form/sporty-melody/58746
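I have already developed a Power Query function that scrapes the form data from that URL page. I am just struggling to get the complete list of the 6 greyhound form URLs (as above) for each race.
Does that make sense?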
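The code in the question builds each URL with `Mid(NextHref, InStr(NextHref, ":") + 28)`, an offset that matches the length of the racecards path, so it may not suit links of the greyhound-form kind listed above. A minimal sketch of a more general approach follows. It assumes that hrefs come back prefixed with `about:` (as the `InStr(NextHref, ":")` logic suggests); `SiteRoot` and `AbsoluteUrl` are hypothetical names.

```vba
Option Explicit

' Host taken from the URLs listed above.
Const SiteRoot As String = "https://www.timeform.com"

' Hypothetical helper: turns "about:/path" or "/path" into an absolute URL.
Function AbsoluteUrl(ByVal Href As String) As String
    ' Drop the "about:" prefix if it is present.
    If Left$(Href, 6) = "about:" Then Href = Mid$(Href, 7)

    If Left$(Href, 1) = "/" Then
        AbsoluteUrl = SiteRoot & Href
    Else
        ' Already absolute, or something unexpected: leave it unchanged.
        AbsoluteUrl = Href
    End If
End Function
```

For example, `AbsoluteUrl("about:/greyhound-racing/greyhound-form/ballybough-gary/59297")` returns `https://www.timeform.com/greyhound-racing/greyhound-form/ballybough-gary/59297`.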
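【Comments】:
【Answer 2】: Sure, SIM.
The scraping sequence is as follows:
Get the Greyhound racecard URLs: Greyhound Races
Get the Greyhound dog-information URLs: List of Greyhounds in the race
Get the Greyhound form details; here is an example for Greyhound #1: Form of Each Greyhound #1
Then loop to the next race and repeat.
As I said, my code only scrapes the form details of greyhound #1 in each race. Could you help me get the other dogs as well?
These are my modules; hopefully they have been pasted in correctly: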
Option Explicit

Const DogURL As String = "https://www.timeform.com/greyhound-racing/racecards"

Sub ListDogRace()
    Dim XMLReq As New MSXML2.XMLHTTP60
    Dim HTMLDoc As New MSHTML.HTMLDocument
    Dim TFRaceList As MSHTML.IHTMLElement
    Dim TFRaces As MSHTML.IHTMLElementCollection
    Dim TFRace As MSHTML.IHTMLElement
    Dim NextHref As String
    Dim NextURL As String

    XMLReq.Open "GET", DogURL, False
    XMLReq.send

    If XMLReq.Status <> 200 Then
        MsgBox "Problem" & vbNewLine & XMLReq.Status & " - " & XMLReq.statusText
        Exit Sub
    End If

    HTMLDoc.body.innerHTML = XMLReq.responseText
    Set XMLReq = Nothing

    ' Each matching element is a link to one race card.
    Set TFRaces = HTMLDoc.getElementsByClassName("wfr-race bg-light-gray hover-opacity")
    For Each TFRace In TFRaces
        NextHref = TFRace.getAttribute("href")
        NextURL = DogURL & Mid(NextHref, InStr(NextHref, ":") + 28)
        ListDogsOnPage TFRace.innerText, NextURL
    Next TFRace
End Sub

Sub ListDogsOnPage(DogName As String, DogPageURL As String)
    Dim XMLReq As New MSXML2.XMLHTTP60
    Dim HTMLDoc As New MSHTML.HTMLDocument
    Dim DogRow1 As MSHTML.IHTMLElement
    Dim DogRows1 As MSHTML.IHTMLElementCollection
    Dim DogNameLink1 As MSHTML.IHTMLElement
    Dim NextHref As String
    Dim NextURL As String
    Dim StartRaceNumber As Integer
    Dim LastRaceNumber As Integer

    XMLReq.Open "GET", DogPageURL, False
    XMLReq.send

    If XMLReq.Status <> 200 Then
        MsgBox "Problem" & vbNewLine & XMLReq.Status & " - " & XMLReq.statusText
        Exit Sub
    End If

    HTMLDoc.body.innerHTML = XMLReq.responseText
    Set XMLReq = Nothing

    ' Build the class name for each of the six dogs in turn.
    LastRaceNumber = 6
    For StartRaceNumber = 1 To LastRaceNumber
        Set DogRows1 = HTMLDoc.getElementsByClassName("rpb-greyhound rpb-greyhound-" & StartRaceNumber & " hover-opacity")
        For Each DogRow1 In DogRows1
            Set DogNameLink1 = DogRow1.getElementsByTagName("a")(0)
            NextHref = DogRow1.getAttribute("href")
            NextURL = DogURL & Mid(NextHref, InStr(NextHref, ":") + 28)
            Debug.Print DogRow1.innerText, NextURL
        Next DogRow1
    Next StartRaceNumber
End Sub
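Both procedures repeat the same request and status-check steps, so that part could be pulled into a single function. The following is a minimal sketch rather than the author's code: `FetchHtml` is a hypothetical name, and it relies on the same MSXML2 and MSHTML references as the module above.

```vba
Option Explicit

' Hypothetical helper: returns a parsed document, or Nothing on a non-200 status.
Function FetchHtml(ByVal Url As String) As MSHTML.HTMLDocument
    Dim XMLReq As New MSXML2.XMLHTTP60
    Dim HTMLDoc As New MSHTML.HTMLDocument

    ' Synchronous GET, as in the original procedures.
    XMLReq.Open "GET", Url, False
    XMLReq.send

    If XMLReq.Status <> 200 Then
        Debug.Print "Problem", XMLReq.Status & " - " & XMLReq.statusText, Url
        Exit Function   ' FetchHtml stays Nothing
    End If

    HTMLDoc.body.innerHTML = XMLReq.responseText
    Set FetchHtml = HTMLDoc
End Function
```

A caller would then write `Set HTMLDoc = FetchHtml(DogPageURL)` and test `If HTMLDoc Is Nothing Then Exit Sub` before searching for elements.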
【Comments】: